Though the phrase has now been removed from its marketing materials, GitHub used to call itself a tool for "social coding." This idea is still central to the services GitHub provides—intimate access to the social layer inside of GitHub through the Activity API.
In this chapter we’ll investigate the Activity API by extending a chat robot. You might find it odd that a robot, generally considered an antisocial invention despite all best attempts, would play nicely with a social API, but this is a social robot. GitHubbers use an extensible chat robot called Hubot to record and automate their tasks, and to have fun on the Internet. If there were any robot suited for interacting with the GitHub Activity API, it’s Hubot, described on the site https://hubot.github.com/ as "a customizable, kegerator-powered life embetterment robot."
The Activity API includes:
- Notifications (comments issued to users through various events)
- Stargazing tools (Facebook has "likes" while GitHub has "stars" to indicate approval or interest)
- Watching (a way to track GitHub data)
- Events (a higher-level activity stream useful for following actions of users)
Note
The Activity API section also includes feeds. While feeds are grouped within the Activity API, they are not programmatic in the same way the rest of the API is: they are Atom feeds (similar to RSS feeds), static streams you can subscribe to with an Atom client but not interact with beyond that, so we won't cover them in depth here.
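To get a feel for the Activity API before we wire it into Hubot, here is a minimal sketch (not part of the chapter's code) that lists a user's public event stream using Node's built-in https module. The username xrd is just an example; substitute your own:

https = require 'https'

options =
  hostname: 'api.github.com'
  path: '/users/xrd/events'
  headers:
    # GitHub's API rejects requests that do not send a User-Agent header
    'User-Agent': 'activity-api-example'

https.get options, (res) ->
  data = ''
  res.on 'data', (chunk) -> data += chunk
  res.on 'end', ->
    # each event has a type (PushEvent, PullRequestEvent, ...) and a repo
    for event in JSON.parse( data )
      console.log "#{event.type} on #{event.repo.name}"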
We are going to build an extension to Hubot. When we are done, Hubot will be transformed into a robot that…
- listens for pull request events from GitHub by subscribing to notifications using the GitHub Activity API;
- invites people in the chat room to comment on those pull requests;
- guarantees that communication between it and GitHub is securely delivered (with an unfortunate bug as caveat);
- retrieves vital information from an external service (the Slack API);
- has functionality fully described by automated tests;
- allows easy simulation of inputs and outputs that map to the inputs and outputs it gets from APIs and services; and
- runs with ease on a major PaaS (Heroku).
Hubot provides the skeleton for our chat robot. We’ll add the preceding functionality to Hubot and see how easy it is to combine these features into a coherent whole that solves a real problem.
If you want stability with your Hubot, you need to host it on a server. Hubot is written in NodeJS and requires a hosting service that supports NodeJS. Our Hubot needs to sit on a public IP address (not behind a firewall) because it will receive notifications from GitHub. Hosting on a public server is not strictly required in general; if your Hubot does not need to receive requests from the outside world, you can host it on a private internal server instead.
The simplest and cheapest hosting service for Hubot is Heroku. Once we generate our Hubot, we can simply do a git-push into Heroku to publish our chat robot for free. We’ll show these steps later in the chapter.
Hubot works with many chat endpoints. Your Hubot can connect to almost any popular chat service or protocol: IRC, XMPP, and many commercial services like Gchat, Basecamp, and even Twitter. Slack is a relatively new entrant into the world of chat services, but despite its youth, the Slack API is solid and connecting third-party clients to the Slack service is simple and straightforward. We’ll use Slack as our chat endpoint.
Now let’s create our Hubot and configure it to use Slack.
To build a Hubot you will need a working NodeJS installation, as specified in [appendix]. The following commands create a directory with a barebones Hubot:
$ npm install -g generator-hubot # (1)
$ mkdir slacker-hubot # (2)
$ cd slacker-hubot/
$ yo hubot # (3)
$ npm install hubot-slack --save # (4)
You may not be familiar with these commands, so let’s go over the important ones.
- npm is the tool that installs packages for NodeJS (documented in [appendix]). The npm install -g generator-hubot command installs a command-line tool called yeoman and a plug-in for yeoman that scaffolds Hubot.
- You should create a new directory and enter it so that when you create your Hubot you can store it entirely in its own space.
- You run the generator using the yo hubot command. This builds out the set of files for a minimal Hubot.
- We then install the Slack adapter and save the package to the package.json file.
Now that we have a simple Hubot created we need to create the Slack site where our Hubot will live.
Going to https://slack.com/ starts the process of creating your own Slack site. You’ll need to step through creating an account. Slack sites are segmented by organization, and you’ll want to establish a URL prefix for your Slack site. Typically this is the name of your organization.
Once you have your Slack site created, you need to create a channel, as shown in Creating a channel from the Slack sidebar.
You can name the channel anything you want, but it is often a good mnemonic to use a name that suggests this is a channel where more serious work gets done. You could use a name like "PR Discussion" to indicate this is the channel where PRs are discussed. To keep things simple, we will use the name "#general." Once you click the link to create a channel, you’ll see a popup asking for the name and an optional description. After you have created the channel, you will see a link to "Add a service integration" as shown in Adding service integrations to Slack.
Slack supports many different service integrations, and one of them is Hubot as shown in Service integration options for Slack.
Choosing Hubot takes you to a settings screen for your Hubot integration.
Slack automatically generates an authentication token for you. This token is used to verify the connection from your Hubot. This token can be revoked, and in fact the token from Hubot configuration page for Slack has been revoked and can no longer be used to authenticate into Slack. If you ever accidentally publicize this token, you can easily revoke and reassign a token to your Hubot on this screen.
You will also need to specify a name. Use "probot" and if you’d like, change the avatar associated with the Hubot (these options are shown in Hubot configuration page for Slack).
Make sure you save your integration before continuing.
Eventually you will want to run your Hubot on a server, but Hubot can run from a laptop behind a firewall as well. At the beginning of development, while testing and developing your bot and the changes are fast and furious, you probably want to run Hubot locally. In fact, Hubot behind a firewall is almost identical in its feature set with one major exception: anything behind the firewall is inaccessible, obviously, to external services. We are eventually going to be configuring GitHub to send events to us when a pull request is created, and Hubot behind the firewall cannot receive those events. But, for almost all other functionality, running Hubot locally speeds up development cadence.
To run your bot locally, make sure you specify the variables on the command line:
$ HUBOT_SLACK_TOKEN=xoxb-3295776784-nZxl1H3nyLsVcgdD29r1PZCq \
  ./bin/hubot -a slack
This command runs the Hubot script with the Slack adapter. The Slack adapter knows how to interact with the Slack.com service. It requires an authentication token, and this is provided via the environment variable at the beginning of the line.
Your bot should be set up and waiting in the #general room inside your Slack site. Go to the #general room. You can then test that Hubot is properly connected by typing the name of your Hubot followed by a command such as the rules. For example, if our Hubot is named probot, we would type probot the rules, which displays the conversation shown in Hubot's built-in repartee.
We see that our Hubot printed out the rules it abides by (published originally by Isaac Asimov in his "Runaround" short story in 1942).
Out of the box, Hubot supports many commands. To get a list, type probot help; you'll see a list like the one shown in Listing the Hubot vocabulary.
The pug me command is a favorite. Many people new to Hubot quickly get sucked into spending hours looking at cute pictures of pugs. Beware!
Now that we’ve successfully started our Hubot locally, we can move it to Heroku and keep it running even when our laptop is turned off.
Heroku requires registration before using it. Heroku offers free plans and everything we’ll do here can be done using one of them. Once you have created an account, install the Heroku toolbelt found here: https://toolbelt.heroku.com/. The toolbelt provides a set of tools useful for managing Heroku applications. You will need to have Ruby set up as explained in [introduction].
If your chatbot is working per the instructions in the previous section, then it is almost ready to deploy to Heroku. You'll need to add the same environment variable using the Heroku tools. In addition to the authentication token for Slack, you will need to configure a URL for your site. Heroku will generate a URL for you from the name of your project (in this case inqry-chatbot); as long as the name has not already been claimed by someone else, you can name it as you wish:
$ heroku create inqry-chatbot
$ heroku config:add HEROKU_URL=https://inqry-chatbot.herokuapp.com/
$ heroku config:add HUBOT_SLACK_TOKEN=xxbo-3957767284-ZnxlH1n3ysLVgcD2dr1PZ9Cq
$ git push heroku master
Fetching repository, done.
Counting objects: 5, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 317 bytes | 0 bytes/s, done.
Total 3 (delta 2), reused 0 (delta 0)
-----> Node.js app detected
-----> Requested node range: 0.10.x
...
-----> Compressing... done, 6.8MB
-----> Launching... done, v9
       https://inqry-chatbot.herokuapp.com/ deployed to Heroku
To [email protected]:inqry-chatbot.git
   d32e2db..3627218  master -> master
If you need to troubleshoot issues with your Hubot, you can always view logs for your application by running the heroku logs -t command:
$ heroku logs -t
2014-11-18T07:07:18.716943+00:00 app[web.1]: Successfully 'connected' as hubot
2014-11-18T07:07:18.576287+00:00 app[web.1]: Tue, 18 Nov 2014 07:07:18 GMT connect deprecated limit: Restrict request size at location of read at node_modules/hubot/.../express/.../connect/.../middleware/multipart.js:86:15
...
When you send commands into your chat room you will notice events inside of Heroku. This is a good way to verify that your bot is wired into Slack properly.
You might also want to publish this repository to GitHub. Heroku, as part of hosting your live application, also hosts the full Git repository of your Hubot (Hubot, as friendly as it tries to be, is just another NodeJS application in the end). However, Heroku does not have the additional tools, like user management, that GitHub does. For this reason, use your GitHub account as your code repository, the place where team members develop new features of your chatbot. Build and test locally, and then push into Heroku, using the ease of the Git workflow as a deployment layer.
Now that we have created and installed Hubot, let’s look at the Activity API and determine how we want to code our extension.
The Activity API centers around notifications. These are similar to the notifications you see on social networking sites: events that document important points of interest inside a timeline of activity. GitHub activity events are often tied to important milestones in a developer's day, such as pushing commits to the main remote repository, asking questions on discussion threads associated with a repository, or assigning issues to a developer for review.
These notifications are accessible to team members without programmatically accessing the GitHub API: team members are notified of events by email, based on several rules. GitHub automatically sends notification emails when a user watches a repository and issues or comments are added, a pull request is made, or comments are made on a commit. In addition, even if a user has not watched a repository, they will be notified if they are @mentioned (someone prefixes the @ character to their username inside a comment), when an issue is assigned to them, or when they participate in a discussion associated with any repository.
The GitHub policy for notification is definitely to err on the side of being overly verbose. Many people live in their email, and making sure that all important activities are distributed to the right people involved makes sense. GitHub has a good set of rules for making sure the correct notifications get to the right parties.
Email does falter as a to-do list, however, and at times the ease with which email can be delivered breeds a secondary problem: overwhelm. It can be very easy to lose focus (vital to building software) when you are constantly context switching to check email, and notifications can fly right by. In addition, email is privately directed and prevents easy collaboration; generally people don't share email inboxes. Let's extend our Hubot to help us resolve these problems by taking our GitHub notifications into a shared, "opt-in when you are logged in" communication channel.
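Before we start extending Hubot, note that this notification stream is also available programmatically through the Activity API. The following is a minimal, hedged sketch (not used by our Hubot, which will rely on webhooks instead) of listing unread notifications; it assumes an OAuth token, like the one we create later in this chapter, is available in a GITHUB_TOKEN environment variable:

https = require 'https'

options =
  hostname: 'api.github.com'
  path: '/notifications'
  headers:
    'User-Agent': 'notification-example'
    # the token needs the notifications (or repo) scope
    'Authorization': "token #{process.env.GITHUB_TOKEN}"

https.get options, (res) ->
  body = ''
  res.on 'data', (chunk) -> body += chunk
  res.on 'end', ->
    for n in JSON.parse( body )
      console.log "#{n.subject.type}: #{n.subject.title} (#{n.repository.full_name})"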
Hubot extensions are written in either JavaScript or CoffeeScript. CoffeeScript is an intermediate language that compiles directly to JavaScript. Many people prefer writing CoffeeScript because it has a cleaner syntax and produces "safer" JavaScript (the syntax helps you avoid common pitfalls in the JavaScript language, like what "this" refers to). CoffeeScript is an indentation-based language (much like Python), and after the initial learning curve it can feel easier to read than JavaScript, especially when you have many nested function callbacks (common in JavaScript programming); it is easier to see where a function begins and ends given the indentation levels. Hubot is itself written in CoffeeScript, and we'll write our extension in CoffeeScript as well.
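As a quick illustrative sketch (not taken from the chapter's code), here is a response handler written in CoffeeScript, followed, in comments, by roughly the JavaScript it compiles to:

# CoffeeScript: "->" defines a function and indentation marks its body
robot.respond /hello/i, (res) ->
  res.reply "Hi there!"

# The compiled JavaScript is roughly:
# robot.respond(/hello/i, function(res) {
#   return res.reply("Hi there!");
# });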
Note
CoffeeScript is a language where indentation is significant. For readability, when we display a snippet of code from a longer file, we sometimes change or remove the initial indentation of that snippet. If you copy the code without realignment, the snippet will not work until you reindent it to fit the context in which it sits.
The Hubot extension module format is exceedingly simple. You write JavaScript modules (using the exports syntax), and Hubot passes you a robot object that you program using several API methods.
There are a few concepts useful to programming Hubot. You can find an example of each of these methods inside the example.coffee file inside the scripts directory:
- Hubot has a "brain." This is an internal state object, which means its values persist across chat messages. This state is not persisted to a database by default, so it is not restored if you restart Hubot; however, an optional persistence mechanism is exposed via Redis, which requires configuration. The brain is the way you set and get values that are saved across discrete messages.
- Hubot has different response mechanisms. It can choose to respond only when it hears exact phrases or when keywords are found in any message, and you don't need to do the grunt work inside your code to distinguish between these communication types.
- Hubot includes an HTTP server. You might need your Hubot to accept requests from additional services beyond the chat service, and Hubot makes it easy to accept these kinds of requests.
- Hubot has a built-in HTTP client. You can easily access HTTP resources within Hubot; many popular extensions to Hubot access a web service when Hubot receives a request.
- Hubot commands can include parameters. You can tell a Hubot to do something multiple times and write a generic function that accepts options.
- Hubot can handle events. Each chat service has a generalized set of events that are normalized to a common API, and Hubot can be programmed to interact with them. For example, Hubot can perform actions when a room topic changes or when users leave rooms.
- Hubot can handle generic errors at the top level. Hubot can be programmed with a catch-all error handler so that no matter where your code fails, you can catch it without crashing your bot. (A short sketch combining a few of these features follows this list.)
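Here is a minimal, hypothetical sketch combining the brain, a respond matcher with a captured parameter, and the HTTP server; the filename and the command phrases are invented for illustration, but the robot.brain, robot.respond, and robot.router calls are the same APIs we will use later in the chapter:

# scripts/sketch.coffee (hypothetical example, not part of our extension)
module.exports = (robot) ->

  # parameterized command: "probot remember pizza" stores a value in the brain
  robot.respond /remember (.*)/i, (res) ->
    robot.brain.set 'thing', res.match[1]
    res.reply "OK, I'll remember #{res.match[1]}"

  robot.respond /what do you remember/i, (res) ->
    res.reply robot.brain.get( 'thing' ) or "Nothing yet"

  # the built-in HTTP server: curl http://localhost:8080/remembered
  robot.router.get '/remembered', (req, res) ->
    res.send "#{robot.brain.get( 'thing' )}\n"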
Hubot will use the first five of these features:
- We will use the Hubot brain to store a PR review request. If Hubot asks a user to review a PR, it needs to keep track of this so that when the user responds it has some context of the request.
- We will use the respond method to program our Hubot to handle a request when a user accepts or declines the review request.
- We will use the HTTP server to accept PR notifications from GitHub webhooks.
- We will use the HTTP client to get a list of users from Slack.
- We will use the parameterization of requests to Hubot to retrieve the specific pull request ID from a chat user message.
There are examples of the other two features (events and generic errors) inside the example script that ships with the Hubot source code, but we won't use those APIs in our Hubot.
As we’ve seen in other chapters, pull requests are the mechanism used on GitHub to easily integrate code changes into a project. Contributors either fork the master repository and then issue a pull request against that repository, or, if they have write permission to the main repository, make a "feature" branch and then issue a pull request against the "master" branch.
Pull requests often come with a chat message indicating several people who should review the request. This tribal knowledge about who should be involved lives only in the head of the developer who created the code. It could be that they invited the correct people. Or it could be that they invited the people they prefer to review their code, for various (and completely rational) reasons. This can be an effective way to engage the right people around a new piece of code.
Inviting reviewers this way can have downsides as well: if the person is otherwise engaged, pull requests can linger when a notification email goes unread. And there is good research indicating that the best-performing teams are those who share all tasks and responsibilities equally. It often does not scale to ask everyone to participate in every code review associated with a pull request, but randomly selecting developers involved in a project might be a better (and more efficient) way to assign reviews than asking the developer who wrote the code to choose the reviewers.
Hubot will assign active chat room users to do code reviews when a new pull request is created. We will use the GitHub Activity API to subscribe to pull request events. When Hubot becomes aware that a pull request needs review, it will randomly assign a user in the chat room to do the review and then ask that user if they want to accept the challenge. If they accept, we will note that in the pull request comments.
We will start writing our extension by defining the high-level communication format we expect from our users. Our script has a simple vocabulary: look for responses indicating acceptance or refusal of our review requests. Our extension script should be in the scripts directory and named pr-delegator.coffee. This is just the back and forth we will be having with users; we are not yet writing any code to handle the pull request notifications:
module.exports = (robot) -> # (1)

  robot.respond /accept/i, (res) -> # (2)
    accept( res )

  robot.respond /decline/i, (res) -> # (3)
    decline( res )

accept = ( res ) -> # (4)
  res.reply "Thanks, you got it!"
  console.log "Accepted!" # (5)

decline = ( res ) -> # (6)
  res.reply "OK, I'll find someone else"
  console.log "Declined!"
This is a dense piece of code and can be confusing if you are new to CoffeeScript. At the same time, hopefully you will agree that this is amazingly powerful code for such a small snippet after reading these notes.
- All NodeJS modules start by defining entry points using the exports syntax. This code defines a function that expects a single parameter; when the function is executed, that parameter will be the robot. The Hubot framework passes in a robot object for us, which we program further down.
- The Hubot API defines a method on the robot object called respond, which we use here. It takes two parameters: a regular expression to match against, and a function that receives an instance of the chat response object (called res here). The second line uses the API for this response object to call a method accept with the response object. We define accept in a moment.
- We set up a response matcher for a decline response.
- Now we define the accept method. It receives the response object generated by the Hubot framework and calls the reply method, which, you guessed it, sends a message back into the chat channel with the text "Thanks, you got it!"
- The accept method also calls console.log with information that is displayed on the console from which we started Hubot. This is a simple way to assure ourselves that everything worked correctly; if we don't see this message, the code before this point is broken. The console.log output is not visible to any users in the channel. It is good practice to remove this code when you finalize your production code, but if you forget, it won't affect anything happening in the channel.
- We then define the decline method using the same APIs as the accept method.
If Hubot is running, you will need to restart it to reload any scripts. Kill Hubot (using Ctrl-C), restart it, and then play with commands inside your Slack site. Enter the commands probot accept and probot decline and you'll see Hubot responding inside the channel. You'll also see the message Accepted! or Declined! printed to the console on which Hubot is running.
Now that we have the basics of our Hubot working, let's make sure we certify our code with some tests. We'll use the Jasmine testing framework for NodeJS. It offers an elegant behavior-driven testing syntax where you specify a behavior as the first parameter to an it function and, as a second parameter, a function that is run as the test itself. Jasmine manages running each it call and displays a nice summary of passing and failing tests at the end of your run. Jasmine tests are typically written in JavaScript, but recent versions of Jasmine also support tests written in CoffeeScript. Hubot is written in CoffeeScript, so let's write our tests in CoffeeScript as well. We need to put our tests inside a directory called spec and make sure our filename ends with .spec.coffee; let's use spec/pr-delegator.spec.coffee as the complete filename. Jasmine expects spec files to have .spec. at the end of their filename (before the extension, either .js or .coffee); if your filename does not match this pattern, Jasmine won't recognize it as a test.
Probot = require "../scripts/pr-delegator"
Handler = require "../lib/handler"

pr = undefined
robot = undefined

describe "#probot", ->
  beforeEach () ->
    robot = {
      respond: jasmine.createSpy( 'respond' )
      router: {
        post: jasmine.createSpy( 'post' )
      }
    }

  it "should verify our calls to respond", (done) ->
    pr = Probot robot
    expect( robot.respond.calls.count() ).toEqual( 2 )
    done()
The first line in our test requires, or loads, the Hubot extension module into our test script, giving us a function we save in a Probot variable. We then create a describe function, which is an organizing function used to group tests. describe functions take an identifier (in this case #probot) and a function containing multiple it calls. In addition, a describe function can contain a beforeEach function that configures elements common to our it calls; in this case we create a faked robot object that we pass into our Probot function call. When we are running Hubot itself, Hubot creates the robot and passes it into the Probot function, but when we run our tests, we generate a fake one and query it to make sure it is receiving the proper configuration. If we make a change inside our actual Hubot code and forget to update our tests to verify those changes, our tests will fail and we'll know we need to either augment our tests or investigate what broke inside our robot, a good automated sanity check for us when we are feverishly coding away, animating our helpful Hubot.
You should see some similarities between the calls made to our robot (robot.respond and robot.router.post) and the tests. We set up "spies" using Jasmine; these generate fake functions capable of recording any interaction from outside sources (either our production code or the test code harness). Inside our it call, we then verify that those calls were made. We use the expect function to verify that we made two calls to the respond function defined on the robot; shortly we will also verify that robot.router.post has been called.
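If Jasmine spies are new to you, the following stripped-down sketch (independent of our Hubot code) shows the pattern these tests rely on; it assumes the same Jasmine 2-style API (calls.count(), toHaveBeenCalled) used throughout this chapter:

describe "#spies", ->
  it "records calls made to a spy", ->
    greet = jasmine.createSpy( 'greet' )
    greet( "hello" )
    greet( "goodbye" )
    expect( greet ).toHaveBeenCalled()
    expect( greet ).toHaveBeenCalledWith( "hello" )
    expect( greet.calls.count() ).toEqual( 2 )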
We need to install Jasmine, and we do this by adding it to our package.json file. Append "jasmine-node": "^2.0.0" to the dependencies, and make sure to add a comma to the entry above it. Adding this line specifies that the minimum version of jasmine-node we will use is 2.0.0:
...
"hubot-shipit": "^0.1.1",
"hubot-slack": "^3.2.1",
"hubot-youtube": "^0.1.2",
"jasmine-node": "^2.0.0"
},
"engines": {
...
Running the following commands will then install Jasmine (the library and a test runner command-line tool) and run our tests. We abbreviate some of the installation output to save space:
$ npm install
...
[email protected] node_modules/hubot-slack
└── [email protected] ([email protected], [email protected], [email protected])
...
$ ./node_modules/.bin/jasmine-node --coffee spec/
.
Finished in 0.009 seconds
1 test, 1 assertions, 0 failures, 0 skipped
Our tests pass and we now have a way to document and verify that our code does what we think it does.
We are now in a position to start adding the actual functionality to our Hubot. Our first requirement is to register for pull request events. We could do this from within the GitHub website, but another way is to use the cURL tool to create the webhook from the command line. In order to do this, we need to first create an authorization token, and then we can use that token to create a webhook.
To create the token, run this command, setting the proper variables for your username instead of mine ("xrd"):
$ export USERNAME=xrd
$ curl https://api.github.com/authorizations --user $USERNAME --data \
'{"scopes":["repo"], "note": "Probot access to PRs" }' -X POST
This call can return in one of three ways. If your username or password is incorrect, you will get an error response message like this:
{
  "message": "Bad credentials",
  "documentation_url": "https://developer.github.com/v3"
}
If your username and password are correct and you don’t have two-factor authentication turned on, the request will succeed and you will get back a token inside the JSON response:
{
  "id": 238749874,
  "url": "https://api.github.com/authorizations/9876533",
  "app": {
    "name": "Probot access to PRs",
    "url": "https://developer.github.com/v3/oauth_authorizations/",
    "client_id": "00000000000000000000"
  },
  "token": "fakedtoken1234",
  "hashed_token": "fakedhashedtoken7654",
...
If you are using two-factor authentication then you will see a response message like this:
{
  "message": "Must specify two-factor authentication OTP code.",
  "documentation_url":
    "https://developer.github.com/v3/auth#working-with-two-factor-authentication"
}
If you get this message in response to the prior cURL command, you will receive a one-time password via your chosen two-factor method (SMS, a two-factor authentication app like Google Authenticator, or recovery codes that you printed out). If you use text messaging, check your text messages and then resend the request, appending a header using cURL:
$ curl https://api.github.com/authorizations --user $USERNAME --data \
'{"scopes":["repo"], "note": "Probot access to PRs" }' -X POST \
--header "X-GitHub-OTP: 423584"
Enter host password for user 'xrd':
If all these steps complete successfully (regardless of whether you are using two-factor authentication or not) you will then receive an OAuth token:
{
  "id": 1234567,
  "url": "https://api.github.com/authorizations/1234567",
  "app": {
    "name": "Probot access to PRs (API)",
    "url": "https://developer.github.com/v3/oauth_authorizations/",
    "client_id": "00000000000000000000"
  },
  "token": "ad5a36c3b7322c4ae8bb9069d4f20fdf2e454266",
  "note": "Probot access to PRs",
  "note_url": null,
  "created_at": "2015-01-13T06:23:53Z",
  "updated_at": "2015-01-13T06:23:53Z",
  "scopes": [
    "notifications"
  ]
}
Once this completes, we have a token we can use to create a webhook. Make sure to use the correct repository name and access token before running the cURL command. We will also need the endpoint we created when we published into Heroku (in our case https://inqry-chatbot.herokuapp.com):
$ REPOSITORY=testing_repostory
$ TOKEN=ad5a36c3b7322c4ae8bb9069d4f20fdf2e454266
$ WEBHOOK_URL=https://inqry-chatbot.herokuapp.com/pr
$ CONFIG=$(echo '{
  "name": "web",
  "active": true,
  "events": [
    "push",
    "pull_request"
  ],
  "config": {
    "url": "'$WEBHOOK_URL'",
    "content_type": "form",
    "secret" : "XYZABC"
  }
}')
$ curl -H "Authorization: token $TOKEN" \
  -H "Content-Type: application/json" -X POST \
  -d "$CONFIG" https://api.github.com/repos/$USERNAME/$REPOSITORY/hooks
{
  "url": "https://api.github.com/repos/xrd/testing_repostory/hooks/3846063",
  "test_url":
    "https://api.github.com/repos/xrd/testing_repostory/hooks/3846063/test",
  "ping_url":
    "https://api.github.com/repos/xrd/testing_repostory/hooks/3846063/pings",
  "id": 3846063,
  "name": "web",
  "active": true,
  "events": [
    "push",
    "pull_request"
  ],
  "config": {
    "url": "https://inqry-chatbot.herokuapp.com/pr",
    "content_type": "json"
  },
  "last_response": {
    "code": null,
    "status": "unused",
    "message": null
  },
  "updated_at": "2015-01-14T06:23:59Z",
  "created_at": "2015-01-14T06:23:59Z"
}
There is a bit of bash cleverness here, but nothing to be overly disturbed by. We create a few variables that we use in the final command. Since the $CONFIG variable is particularly long, we use echo to build a block of JSON with the webhook URL interpolated in the middle. If you want to see the result of that variable, type echo $CONFIG and you'll notice the snippet … "url": "https://inqry-chatbot.herokuapp.com/pr" … properly interpolated.
Here we use the Heroku API URL as our webhook endpoint. This means we need to have things hosted on Heroku for the webhook to talk to our HTTP server properly. We can do some things (like connecting the Hubot to the Slack service) from behind a firewall and have it talk with other chat room participants, but any webhook request will fail unless the chat client is running on a publicly available server.
Be careful to set the content_type to "form" (which is the default, so you could leave it blank). Setting it to json will make it difficult to retrieve the raw body inside your Hubot when the POST request is received and to validate the request using a secure digest. We want to make sure all requests are real requests from GitHub and not a cracker attempting to maliciously inject themselves into our conversations. To protect against this, we verify each request from GitHub using the secret generated when we created the webhook. We'll discuss this in detail later in this chapter, but for now, establish a secret when you create the hook. A cracker might be able to guess where our endpoint exists, but unless Heroku or GitHub is compromised, they won't know our webhook secret.
We should update our tests to make sure we anticipate this new
functionality. We will be using the Hubot HTTP server, which
piggybacks on the built-in express server running inside of Hubot. Our
new test should reflect that we use the router.post
method exposed
to our Hubot, and that it is called once. We add this next test to the
end of our spec file:
it "should verify our calls to router.post", (done) ->
pr = Probot robot
expect( robot.router.post ).toHaveBeenCalled()
done()
This additional test will fail if we run it now. So let's extend our Hubot to handle webhook callbacks from GitHub. Add this to the end of the module.exports function in our extension script (it needs access to the robot object):
robot.router.post '/pr', ( req, res ) ->
  console.log "We received a pull request"
Now if we run our tests, they should all pass. Once they do, publish the new version of the app to Heroku. We'll omit this step in the future, but if you want to receive pull requests on the route you have set up, remember that you need to push your files to Heroku so the endpoint is public.
$ ./node_modules/.bin/jasmine-node --coffee spec/
..
Finished in 0.009 seconds
2 tests, 2 assertions, 0 failures, 0 skipped
$ git commit -m "Working tests and associated code" -a
...
$ git push heroku master
Fetching repository, done.
Counting objects: 5, done.
Delta compression using up to 8 threads.
...
We now have an end-to-end Hubot setup, ready to receive webhook notifications.
We can now start testing our Hubot with real GitHub
notifications. First, let’s set up a repository we can use for
testing. Creating the new repository on GitHub is a quick task if we
use the hub
tool described in [Jekyll]:
$ mkdir testing_repository
$ cd testing_repository
$ git init
$ touch test.txt
$ git add .
$ git commit -m "Initial checkin"
$ hub create
...
Now we can create a real pull request for our repository from the command line and test our Hubot. A typical pull request flow looks like the following:
- Create a new branch
- Add new content
- Commit the content
- Push the new branch into GitHub
- Issue a pull request
All of this can be automated using a combination of Git commands and cURL. We’ve seen some of these commands before and can reuse the previous command-line invocations and variables we used when generating our webhook using the API via cURL. Our config variable is similar, but the required fields in this case are: the title and body for the pull request, the "head" key that matches the name of the branch, and where to merge it to using the "base" key.
Creating a new branch, adding some content, and then issuing a pull request against the branch might be something we need to do several (or more) times as we experiment and learn about the Hubot extension API. The examples here work right out of the box, but don’t be fooled into thinking that it all went exactly as we expected the first time. Given that, these are commands you might want to perform multiple times as you are experimenting, so let’s put the commands described in the previous paragraph into a bash script that is generic and can be run multiple times. We can call it issue-pull-request.sh and place the script inside the test directory:
# Modify these three variables
AUTH_TOKEN=b2ac1f43aeb8d73b69754d2fe337de7035ec9df7
USERNAME=xrd
REPOSITORY=test_repository
DATE=$(date "+%s")
NEW_BRANCH=$DATE
git checkout -b $NEW_BRANCH
echo "Adding some content" >> test-$DATE.txt
git commit -m "Adding test file to test branch at $DATE" -a
git push origin $NEW_BRANCH
CONFIG=$(echo '
{ "title": "PR on '$DATE'",
  "body" : "Pull this PR'$DATE'",
  "head": "'$NEW_BRANCH'",
  "base": "master"
}' )
URL=https://api.github.com/repos/$USERNAME/$REPOSITORY/pulls
curl -H "Authorization: token $AUTH_TOKEN" \
-H "Content-Type: application/json" -X POST -d "$CONFIG" "$URL"
This script generates a unique string based on the current time. It
then creates and checks out a new branch based on that name, adds some
content to a unique file, commits it, pushes it into GitHub, and generates a
pull request using the API. All you will need to do is make a one-time
update to the three variables at the top of the script to match your
information. This script is resilient in that even if your auth token were incorrect (or
had expired) this command would do nothing other than add testing data
to your test repository, so you can experiment safely. Just be sure
to pay attention to whether you see a successful JSON request as shown
in the following code or an error message. And, as we are going to run this script as
a command, make it executable using the chmod
command.
Now, let’s run it and see what happens:
$ chmod +x ./issue-pull-request.sh
$ ./issue-pull-request.sh
{
  "url": "https://api.github.com/repos/xrd/testing_repostory/pulls/1",
  "id": 27330198,
  "html_url": "https://github.com/xrd/testing_repostory/pull/1",
  "diff_url": "https://github.com/xrd/testing_repostory/pull/1.diff",
  "patch_url": "https://github.com/xrd/testing_repostory/pull/1.patch",
  "issue_url": "https://api.github.com/repos/xrd/testing_repostory/issues/1",
  "number": 1,
  "state": "open",
  "locked": false,
  "title": "A PR test",
  "open_issues_count": 1,
...
This returns a huge JSON response (abbreviated here), but you can see the first item is a link to the pull request. For a human-readable link, we should use the link called html_url. Were we to visit this link, we could merge the pull request from within the GitHub web UI.
To see more context on what is happening with this pull request, navigate to the settings for the repository on GitHub, follow the "Webhooks and Services" link in the left navigation bar, and at the very bottom of the page you will find a list of recent deliveries to our webhook, as in Recent failed deliveries from our webhook.
These requests all failed; our Hubot is not correctly configured to handle real HTTP requests from GitHub. This does show that GitHub is trying to do something when a pull request is received. We’ll work on getting our handler code written and pushed into Heroku, and then issue another PR.
Let's build the HTTP handler that runs when PR notifications arrive from GitHub. At first glance, we might take the easy route and add it directly into the top-level script. But given that JavaScript handles events inside callbacks, and that Hubot extensions only export a single constructor (using the module.exports syntax), it is easier to create, and more importantly to test, a separate module, which we require in our main extension script.
We start by writing our tests. We’ve already created a test that
verifies the call to robot.router.post
. Our new functionality will
actually handle the PR notification, so let’s add a new grouping using
the describe syntax and call it "#pr". The new functionality is
simple: if the Hubot receives the proper parameters (most importantly
that the internal secret matches the secret sent on the request) then
we accept the PR as valid and message our room with further
instructions, namely inviting some user to review this pull
request. Our handler then needs to expose two methods:
prHandler
, which is where we delegate any information coming from an
HTTP request to the /pr
route, and a method where we can configure
the secret, which we call setSecret
. Once we have established this
internal signature for our handler library, we can add two simple
tests and then our library.
We have two tests: one that handles the correct flow and one that handles the incorrect flow. In a before block (which runs before each test) we set up a fake robot and a fake response, and set the secret on our handler module. The fakes implement the same methods the real objects do (messageRoom on the robot and send on the response), but as Jasmine spies, so we can verify these functions are called by our implementation code:
describe "#pr", ->
secret = "ABCDEF"
robot = undefined
res = undefined
beforeEach ->
robot = {
messageRoom: jasmine.createSpy()
}
res = { send: jasmine.createSpy() }
Handler.setSecret secret
it "should disallow calls without the secret", (done) ->
req = {}
Handler.prHandler( robot, req, res )
expect( robot.messageRoom ).not.toHaveBeenCalled()
expect( res.send ).toHaveBeenCalled()
done()
it "should allow calls with the secret", (done) ->
req = { body: { secret: secret } }
Handler.prHandler( robot, req, res )
expect( robot.messageRoom ).toHaveBeenCalled()
expect( res.send ).toHaveBeenCalled()
done()
Now, add a file called ./lib/handler.coffee:
_SECRET = undefined

exports.prHandler = ( robot, req, res ) ->
  secret = req.body?.secret
  if secret == _SECRET
    console.log "Secret verified, let's notify our channel"
    room = "general"
    robot.messageRoom room, "OMG, GitHub is on my caller-id!?!"
  res.send "OK\n"

exports.setSecret = (secret) ->
  _SECRET = secret
As you can see, the Hubot API does a lot of work for us: it processes the POST request to the /pr endpoint and provides us with the parsed parameters inside the body object. We use that to retrieve the secret from the request. Even if you have used CoffeeScript before, you may not be familiar with the ?. syntax: it simply tests whether body is defined and, if so, whether it has a key named secret. This prevents us from crashing if the secret is not sent with the request. If the secret from the request matches the configured secret, we message the room; otherwise we ignore the request. In either case, we need to respond to the calling server using the send method (send is provided by the Express server built into Hubot). For debugging purposes we log that the secret was validated, if it was in fact validated, but otherwise our response to the calling client is the same regardless of whether it provided a correct secret. We don't want to give an attacker anything extra if they pass in an incorrect secret.
If we run our tests we will see them all pass:
$ node_modules/jasmine-node/bin/jasmine-node \
--coffee spec/pr-delegator.spec.coffee
....
Finished in 0.01 seconds
4 tests, 6 assertions, 0 failures, 0 skipped
Hubot will spawn the HTTP server wherever it runs so we can talk to it
on our local machine (though this will likely be inside a firewall and
inaccessible to GitHub), so we can test it using cURL
locally. Remember that our robot router accepts commands as HTTP POST
requests, so we need to specify a post request (using the --data
switch with cURL):
$ ( HUBOT_SLACK_TOKEN=xoxb-3295776784-nZxl1H3nyLsVcgdD29r1PZCq \
./bin/hubot -a slack 2> /dev/null | grep -i secret & )
$ curl --data '' http://localhost:8080/pr
Invalid secret
OK
$ curl --data 'secret=XYZABC' http://localhost:8080/pr
Secret verified
OK
$ kill `ps a | grep node | grep -v grep | awk -F ' ' '{ print $1 }'`
These commands verify that things are working properly. First, we start the server, piping the output to grep to constrain it to lines related to our secret processing (we also background the entire chain using an ampersand and parentheses, a bash trick). Then we hit the server running locally without the secret: the server (as it is running in the same shell) prints out the message "Invalid secret" using console.log, and then cURL prints out "OK," which is what was returned from our server. If we run the command again, this time including the secret as a POST parameter, we see that Hubot verified the secret against its own internal secret, and then cURL again prints "OK," which is what the Express server inside Hubot returned to the calling client. The final line quits Hubot: it finds the PID for the Hubot client (which runs as a node process) and sends it a signal (SIGTERM, the default for kill), telling Hubot that it should quit.
Provided you connected correctly to your Slack site, you’ll also see a message inside your #general channel, which says "OMG, GitHub is on my caller-id!?!" We now have a simple way to trigger a pull request notification without going through the formality of actually generating a pull request. Between our script, which issues real pull requests through the GitHub API, and this one that fakes a webhook notification, we have the ability to test our code externally as we develop it. Of course, our tests are valuable, but sometimes it is impossible to understand what is happening inside of our Hubot without running against the real Hubot and not a test harness.
Now that we have an incoming pull request (albeit one we are faking), we need to write the code to find a random user and assign them to the pull request.
Warning
This next section is redundant; our Hubot will function exactly as we need it to if you disregard all of the code in this section. As I was writing this book, I mistakenly missed the fact that the Hubot brain already provides the list of users in the room, so calling out to the Slack API is unnecessary. Initially I planned to remove this entire section. However, it does demonstrate the ease of using an external service through the built-in HTTP client, which is a powerful feature of Hubot. It also demonstrates how helpful tests are when developing a Hubot extension; I was able to refactor to a radically different internal code path for getting the list of users and maintain faith that the end-to-end process of my code worked by refactoring and then fixing broken tests. And, though not important for this section per se, the Slack API provides much richer data on the users logged in to a room, which could be valuable in other situations. If you want to skip to the next section, you will still have all the code needed to build our Hubot as we described earlier. But I think it is a worthwhile read for general Hubot understanding.
To find a user in the room, one option is to go outside the Hubot API and use the Slack API to query for a list of users. The Slack API provides an endpoint that gives you a list of users. To access the Slack API, we will use the built-in Hubot HTTP client. Once we have the list of members, we can choose one at random and deliver the PR review request to them:
_SECRET = undefined

anyoneButProbot = (members) -> # (1)
  user = undefined
  while not user
    user = members[ parseInt( Math.random() * \
      members.length ) ].name
    user = undefined if "probot" == user
  user

sendPrRequest = ( robot, body, room, url ) -> # (2)
  parsed = JSON.parse( body )
  user = anyoneButProbot( parsed.members )
  robot.messageRoom room, "#{user}: Hey, want a PR? #{url}"

exports.prHandler = ( robot, req, res ) ->
  slack_users_url = # (3)
    "https://slack.com/api/users.list?token=" +
    process.env.HUBOT_SLACK_TOKEN
  secret = req.body?.secret # (4)
  url = req.body?.url
  if secret == _SECRET and url
    room = "general"
    robot.http( slack_users_url ) # (5)
      .get() (err, response, body) ->
        sendPrRequest( robot, body, \
          room, url ) unless err
  else
    console.log "Invalid secret or no URL specified"
  res.send "OK\n"

exports.setSecret = (secret) ->
  _SECRET = secret
- We define a method called anyoneButProbot that takes a list of users and finds a random one, as long as it is not the Hubot itself.
- The sendPrRequest method parses the JSON returned from the Slack API and then passes the members inside that object to the anyoneButProbot call. It then uses the Hubot API to send a message to the room asking whether that user will accept the pull request review invitation.
- We build the URL to the Slack service by tacking the Slack API token onto the base Slack API URL.
- As we did before, we pull out the secret and the PR URL, and then make sure they both exist.
- We use the built-in HTTP client to make a GET request to the Slack API. Unless we receive an error in the response callback, we use the data provided by the Slack API to initiate the PR review request.
To test this using our cURL command, we need to modify the invocation slightly:
$ curl --data 'secret=XYZABC&url=http://pr/1' \
http://localhost:8080/pr
Our randomly selected user will see the text username: Hey, want a PR? http://pr/1 (and the Slack client will format that link as a clickable URL).
Unfortunately, our tests are now broken: we get the failure TypeError: Object #<Object> has no method 'http'. The mocked robot object we pass into our tests does not have the HTTP interface that comes with Hubot, so we need to add it to our fake robot. The method signature for the HTTP client (which comes from the node-scoped-http-client NodeJS package) is hairy: you chain calls together to build up an HTTP client request and end up with a function into which you pass a callback that handles the response body. This module makes you write code that is not particularly testable (said another way, it was challenging for me to understand what the faked test implementation should look like), but the setup code does work, and the test itself documents an interface to our robot that is easily understandable. We simulate the same chain, defining an http attribute on the mocked robot object, an attribute that resolves to a function. Calling that function returns an object that has a get method, and calling get returns a function into which our implementation passes its callback. In real life that callback would receive the error code, the response object, and the JSON. In our case, as long as the error code is empty, our implementation will parse the JSON for members and then issue the PR request:
json = '{ "members" : [ { "name" : "bar" } , { "name" : "foo" } ] }'
httpSpy = jasmine.createSpy( 'http' ).and.returnValue(
{ get: () -> ( func ) ->
func( undefined, undefined, json ) } )
beforeEach ->
robot = {
messageRoom: jasmine.createSpy( 'messageRoom' )
http: httpSpy
}
res = { send: jasmine.createSpy( 'send' ) }
Handler.setSecret secret
it "should disallow calls without the secret", (done) ->
req = {}
Handler.prHandler( robot, req, res )
expect( robot.messageRoom ).not.toHaveBeenCalled()
expect( httpSpy ).not.toHaveBeenCalled()
expect( res.send ).toHaveBeenCalled()
done()
it "should disallow calls without the url", (done) ->
req = { body: { secret: secret } }
Handler.prHandler( robot, req, res )
expect( robot.messageRoom ).not.toHaveBeenCalled()
expect( httpSpy ).not.toHaveBeenCalled()
expect( res.send ).toHaveBeenCalled()
done()
it "should allow calls with the secret", (done) ->
req = { body: { secret: secret, url: "http://pr/1" } }
Handler.prHandler( robot, req, res )
expect( robot.messageRoom ).toHaveBeenCalled()
expect( httpSpy ).toHaveBeenCalled()
expect( res.send ).toHaveBeenCalled()
done()
The code we write here was definitely not a piece of code where testing came easy; I refactored this multiple times to find a balance between an easy-to-read test and easy-to-read code. Writing test code takes effort, but when both your tests and code are readable and minimal, you generally can be sure you have a good implementation.
We now have a functional and complete implementation of the code to retrieve a list of users and assign an incoming pull request out to a randomly selected user from that list.
Instead of using the Slack API, we can replace the code with a
much simpler call to robot.brain.users
. Calling into the Slack users
API takes a callback, but the brain.users
call does not, which
simplifies our code. We do verify inside our tests that we make a call to
the HTTP Jasmine spy on the get
function, so we will want to remove
that inside our tests. We will need to provide a new function called
users
to the Hubot inside the faked brain we created.
Unfortunately, things don’t just work when we change our code to this:
...
users = robot.brain.users()
sendPrRequest( robot, users, room, url, number )
...
It is likely that what we got back from the Slack API and what Hubot
stores inside its brain for users are functionally the same
information, but structurally stored very differently. How can we
investigate whether this assumption is correct?
NodeJS has a standard library module called util, which includes useful utility functions, as you might expect from the name. One of them is inspect, which will dig into an object and create a pretty-printed view of it. If we use this module with console.log, we can see the full contents of the live users object passed into our code. A line like console.log( require( 'util' ).inspect( users ) ) displays the following:
{ U04FVFE97:
   { id: 'U04FVFE97',
     name: 'ben',
     real_name: 'Ben Straub',
     email_address: 'xxx' },
  U038PNUP2:
   { id: 'U038PNUP2',
     name: 'probot',
     real_name: '',
     email_address: undefined },
  U04624M1A:
   { id: 'U04624M1A',
     name: 'teddyhyde',
     real_name: 'Teddy Hyde',
     email_address: 'xxx' },
  U030YMBJY:
   { id: 'U030YMBJY',
     name: 'xrd',
     real_name: 'Chris Dawson',
     email_address: 'xxx' },
  USLACKBOT:
   { id: 'USLACKBOT',
     name: 'slackbot',
     real_name: 'Slack Bot',
     email_address: null } }
Ah, we were right: the Slack API returns an array, while this is an associative array (called a hash in other languages). So we need to refactor the inputs to our test to take an associative array instead of an array, and then we need a function to flatten it out (after that our code will work the same as before). Our fake robot will return this associative array when robot.brain.users is called, so add a new spy as the users key inside our fake robot:
...
users = { CDAWSON: { name: "Chris Dawson" }, BSTRAUB: { name: "Ben Straub" } }
brainSpy = {
  users: jasmine.createSpy( 'getUsers' ).and.returnValue( users ),
  set: jasmine.createSpy( 'setBrain' ),
...
Inside our implementation code, flatten out the user associative array and find the user inside the new flattened array:
...
flattenUsers = (users) ->
  rv = []
  for x in Object.keys( users )
    rv.push users[x]
  rv

anyoneButProbot = ( users ) ->
  user = undefined
  flattened = flattenUsers( users )
  while not user
    user = flattened[ parseInt( Math.random() * \
      flattened.length ) ].name
    user = undefined if "probot" == user
  user
...
Our wiring is almost complete, so let's actually send real pull request information. If we run our script issue-pull-request.sh, we will see it sending data out to our Hubot. Once we have deployed to Heroku, our Hubot is listening on a public hostname. GitHub will accept the pull request and then send JSON inside the body of a POST request made to our Hubot. This JSON looks very different from the URL-encoded parameters we provide in our cURL script, so we need to modify our code to fit.
If we retrieve the JSON from a POST, it will look something like this (reformatted for clarity and brevity):
{
  "action":"opened",
  "number":13,
  "pull_request": {
    "locked" : false,
    "comments_url" :
      "https://api.github.com/repos/xrd/test_repository/issues/13/comments",
    "url" : "https://api.github.com/repos/xrd/test_repository/pulls/13",
    "html_url" : "https://github.com/xrd/test_repository/pulls/13",
  }
  ...
}
Most importantly, you see a URL (the html_url, specifically) that we will use inside our Hubot message to the user. Retrieving the JSON and parsing it is trivial inside our Hubot:
...
exports.prHandler = ( robot, req, res ) ->
  body = req.body
  pr = JSON.parse body if body
  url = pr.pull_request.html_url if pr
  secret = pr.secret if pr
  if secret == _SECRET and url
    room = "general"
...
Here you see we pull out the body contents, process them as JSON, extract the secret and the URL from the parsed JSON, and then go through our normal routine.
Our tests are simple, and require that we send in JSON:
...
it "should disallow calls without the secret and url", (done) ->
req = {}
Handler.prHandler( robot, req, res )
expect( robot.messageRoom ).not.toHaveBeenCalled()
expect( httpSpy ).not.toHaveBeenCalled()
expect( res.send ).toHaveBeenCalled()
done()
it "should allow calls with the secret and url", (done) ->
req = { body: '{ "pull_request" : { "html_url" : "http://pr/1" },
"secret": "ABCDEF" }' }
Handler.prHandler( robot, req, res )
expect( robot.messageRoom ).toHaveBeenCalled()
expect( httpSpy ).toHaveBeenCalled()
expect( res.send ).toHaveBeenCalled()
done()
...
We are putting the secret inside the JSON as a convenience. The secret will not come in with the JSON when GitHub sends us JSON via the webhook, but this is an easy way to provide it to our handler for the moment. If we run our tests, they should pass now.
Our Hubot is now in a position where it will operate correctly if the secret passes validation and the webhook data is passed properly. Now we need to secure the webhook. GitHub signs your data inside the webhook payload, which provides you with a way to verify the data really came from an authorized host. We need to decode it inside our handler. To do this, we will need to retrieve the secure hash GitHub provides inside the request headers. Then, we will need to calculate the hash ourselves using the secret we maintain internally. If these hashes match, then we know the incoming request and JSON is truly from GitHub and not an attacker:
...
getSecureHash = (body, secret) ->
  hash = crypto.
    createHmac( 'sha1', secret ).
    update( "sha1=" + body ).
    digest('hex')
  console.log "Hash: #{hash}"
  hash

exports.prHandler = ( robot, req, res ) ->
  slack_users_url =
    "https://slack.com/api/users.list?token=" +
    process.env.HUBOT_SLACK_TOKEN
  body = req.body
  pr = JSON.parse body if body
  url = pr.pull_request.html_url if pr
  secureHash = getSecureHash( body, _SECRET ) if body
  webhookProvidedHash = req.headers['HTTP_X_HUB_SIGNATURE' ] \
    if req?.headers
  secureCompare = require 'secure-compare'
  if secureCompare( secureHash, webhookProvidedHash ) and url
    room = "general"
    robot.http( slack_users_url )
      .get() (err, response, body) ->
        sendPrRequest( robot, body, \
          room, url ) unless err
  else
...
The signature is a hash-based message authentication code (HMAC). Verifying an HMAC with a naive string comparison is vulnerable to timing attacks: the time it takes to compare the computed hash against the hash sent in the request leaks information an attacker can use to forge a valid signature one byte at a time. In JavaScript specifically, naive comparison operators like == leak this timing information. To eliminate the risk that this information could be used to compromise the host system, we use a module called secure-compare, which performs the comparison in constant time. To load this module, we add it to our package.json manifest file with the command npm install secure-compare --save.
Now we can adjust our tests to fit the new reality of our handler:
...
it "should disallow calls without the secret and url", (done) ->
req = {}
Handler.prHandler( robot, req, res )
expect( robot.messageRoom ).not.toHaveBeenCalled()
expect( httpSpy ).not.toHaveBeenCalled()
expect( res.send ).toHaveBeenCalled()
done()
it "should allow calls with the secret and url", (done) ->
req = { body: '{ "pull_request" : { "html_url" : "http://pr/1" }}',
headers: { "HTTP_X_HUB_SIGNATURE" :
"cd970490d83c01b678fa9af55f3c7854b5d22918" } }
Handler.prHandler( robot, req, res )
expect( robot.messageRoom ).toHaveBeenCalled()
expect( httpSpy ).toHaveBeenCalled()
expect( res.send ).toHaveBeenCalled()
done()
...
You’ll notice we moved the secret out of the JSON and into the headers. This is the same structure our Hubot will see when the GitHub webhook encodes the content of the JSON and provides us with a secure hash in the HTTP_X_HUB_SIGNATURE key. Inside our test we will need to provide the same signature inside our mocked request object. We could duplicate our secure hash generation code from the handler implementation, or we could be lazy and just run our tests once (knowing they will fail this time), watch for the console.log output that says "Hash: cd970490d83c…" and copy this hash into our mocked request object. Once we do this, our tests will pass.
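If copying hashes out of the log feels fragile, one alternative is to compute the expected signature inside the spec itself with Node’s crypto module, so the mocked header always matches the mocked body. This is only a sketch, and it assumes the spec sets the handler secret to "ABCDEF" via setSecret; signatureFor is a hypothetical helper of our own:

crypto = require 'crypto'

# Hypothetical spec helper: mirror the handler's HMAC-SHA1 computation.
signatureFor = ( body, secret ) ->
  crypto.createHmac( 'sha1', secret ).update( body ).digest( 'hex' )

it "should allow calls with the secret and url", (done) ->
  body = '{ "pull_request" : { "html_url" : "http://pr/1" }}'
  req = { body: body,
    headers: { "HTTP_X_HUB_SIGNATURE": signatureFor( body, "ABCDEF" ) } }
  Handler.prHandler( robot, req, res )
  expect( robot.messageRoom ).toHaveBeenCalled()
  done()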
Now, after reloading our Hubot, if we issue a pull request using our issue-pull-request.sh script, we should see the matching hashes. But we won’t (at least if you used the same package.json file as we specified earlier) because of a critical bug inside of Hubot at the time of this writing.
As we mentioned earlier, Hubot bundles Express.js, a high-performance web framework for NodeJS. Express.js has a modular architecture, where middleware is inserted into a request and response chain. This approach to building functionality and the wide array of middleware allows web developers to string together various standardized middleware components to use only those features needed for the problem at hand. Common middleware includes static file handlers (for serving static files), cookie handlers, session handlers, and body parsers. You can imagine circumstances where you would not need all of these (or you might need others) and this flexibility makes Express.js a popular choice for building NodeJS web applications.
The body parser middleware is of particular interest to us here: it converts the "body" of a request into a JavaScript object attached to the request object. Previously you saw us access it through a variable we called req inside our callback; this, of course, stands for request. The body parser converts whatever data arrives in the body of the HTTP request into a structured JavaScript associative array available as the body attribute on our request object. If the body is URL encoded (as the PR information is when we create the webhook with the content_type set to form), the body parser URL-decodes the content and makes each form field available on the body object; for GitHub’s form-encoded webhooks, the JSON ends up as a string under the payload key. Normally this is a very handy process that removes a lot of grunt work for web application authors.
Unfortunately, because the express object is bundled and configured for us long before our extension is loaded, we cannot change the load order of the body parser middleware from inside our extension, which means we cannot get access to the raw body content. The body parser middleware processes the stream of data by registering for events inside the HTTP request flow. NodeJS made its mark on web application development by providing a network application toolkit centered around one of the most controversial features of JavaScript: the asynchronous callback. In NodeJS, code registers for events and then returns control to the host program. In other languages, such as Ruby, a service that receives data from clients typically blocks the moment you tell it to listen for incoming data. Asynchronous programming is by no means a new concept (consider threading in many languages), but NodeJS offers a simple way to interact with asynchronous functions through event registration. In the case of express middleware, however, this event registration process bites us: middleware loaded first gets first access to incoming data, and once the body parser has consumed our body content, we can no longer access the original content. We need access to the raw body content, and there is no way from inside our Hubot extension to install our own middleware that would provide it when a PR request is received on the router.
What options do we have, then? Fortunately, every bit of our stack here is open source, and we can modify the code inside Hubot that sets up our express server to fit our needs. This code is installed by the npm tool into the node_modules directory, and we can easily find where express is configured inside of Hubot. There are issues with doing it this way: if we rerun npm install we will blow away our node_modules directory, and this is something Heroku will do if it is not told otherwise. A better way might be to fork Hubot, store our own copy of Hubot on GitHub, and then specify our forked copy inside the package.json file. This has issues too: if Hubot gets updated with a critical security fix, we need to merge those changes into our fork, a maintenance burden we would avoid by using tagged releases from the main repository. There is, unfortunately, no perfect way to resolve this problem that does not itself create other problems.
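For reference, pointing the dependency at a fork is a one-line change in package.json; npm understands GitHub user/repo#ref shorthand (the fork and branch names below are hypothetical):

"dependencies": {
  "hubot": "yourname/hubot#raw-body-middleware",
  ...
}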
If you do choose to modify the built-in Hubot code, modify the file robot.coffee inside the node_modules/hubot/src/ directory. The node_modules directory, in case memory fails, is where the NodeJS package manager (npm) builds out the local dependency tree for libraries, and this is the file Hubot uses internally to build the robot object and set up the express HTTP server. If we add the following code at line 288 (this line number might vary if you are not using the same version of Hubot we specify in our package.json), we can install a custom middleware callback that will provide us with the raw body we can use when verifying the HMAC signature:
...
app.use (req, res, next) =>
  res.setHeader "X-Powered-By", "hubot/#{@name}"
  next()

app.use (req, res, next) =>
  req.rawBody = ''
  req.on 'data', (chunk) ->
    req.rawBody += chunk
  next()

app.use express.basicAuth user, pass if user and pass
app.use express.query()
...
Express middleware has a very simple interface: it is nothing more than a JavaScript callback that receives a request, a response, and a continuation function as parameters. We register a listener for the data events that carry the body content and append each chunk to a variable on the request object. When the request object is passed to the pull request handler inside our Hubot, the raw data is already filled in. The next() function tells the middleware host that the next middleware can proceed.
We now need to adjust our tests to fit this new requirement. We prime the pump with a request object that has this rawBody inside it, and we encode the content using encodeURIComponent to match the format in which it will arrive from GitHub:
...
it "should allow calls with the secret and url", (done) ->
payload = '{ "pull_request" : { "html_url" : "http://pr/1" } }'
bodyPayload = "payload=#{encodeURIComponent(payload)}"
req = { rawBody: bodyPayload,
headers: { "x-hub-signature" : \
"sha1=dc827de09c5b57da3ee54dcfc8c5d09a3d3e6109" } }
Handler.prHandler( robot, req, res )
expect( robot.messageRoom ).toHaveBeenCalled()
expect( httpSpy ).toHaveBeenCalled()
expect( res.send ).toHaveBeenCalled()
done()
...
This change breaks our tests against the current implementation, so we will need to modify the code to use the rawBody attribute on the request object, split off the payload key/value pair, URI-decode it, and then, if all of that works, parse the JSON and start the verification process. Our tests describe all of this for us. The new prHandler method looks like this:
...
exports.prHandler = ( robot, req, res ) ->
  rawBody = req.rawBody
  body = rawBody.split( '=' ) if rawBody
  payloadData = body[1] if body and body.length == 2
  if payloadData
    decodedJson = decodeURIComponent payloadData
    pr = JSON.parse decodedJson
    if pr and pr.pull_request
      url = pr.pull_request.html_url
      secureHash = getSecureHash( rawBody, _SECRET )
      signatureKey = "x-hub-signature"
      if req?.headers
        webhookProvidedHash =
          req.headers[ signatureKey ]
      secureCompare = require 'secure-compare'
      if url and secureCompare( "sha1=#{secureHash}",
          webhookProvidedHash )
        room = "general"
        users = robot.brain.users()
        sendPrRequest( robot, users, room, url )
      else
        console.log "Invalid secret or no URL specified"
    else
      console.log "No pull request in here"
  res.send "OK\n"

_GITHUB = undefined
...
When all is said and done, is verifying the signature even worth it? If we are not hosting our Hubot on a service that handles our router requests over HTTPS, this HMAC verification could be compromised. And, given the issues with maintaining our own copy of the Hubot code in order to permit the validation inside our Hubot extension, it might be best to ignore the validation header. The worst case, as our extension is written now, would be that an attacker could fake a pull request notification, and falsely engage chat room users around it. If the PR the attacker used was fake, it might confuse our Hubot, but no real harm would be done. If they used an existing real PR, an attacker could trick our Hubot into adding data to the PR, adding confusion in the comments about who accepted the review request. We won’t solve that potential problem with this code, but you can imagine adding code to our Hubot that handles a case like this (for example, by checking first to see if someone was already tagged on the PR, and ignoring successive incoming webhooks associated with that PR).
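As a rough sketch of that idea (the brain key naming is our own; the Hubot brain itself is introduced properly later in this chapter), the handler could remember which pull request URLs it has already announced and ignore repeats:

# Inside prHandler, just before inviting a reviewer (sketch only):
if robot.brain.get( "pr-announced-#{url}" )
  console.log "Ignoring repeated webhook for #{url}"
else
  robot.brain.set "pr-announced-#{url}", "yes"
  sendPrRequest( robot, users, room, url )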
Our Hubot is now programmed to generate a pull request review message and send it to a random user. What happens when they respond? They can respond in two ways, obviously: accepting or declining the request. Earlier we put placeholders in our Hubot extension that print a debugging message and reply to whoever messaged us; now we can actually wire up handling the response and, if the user accepted, adding a comment to the pull request on GitHub on their behalf.
There are multiple ways in which a Hubot can interact with chat room messages. We chose the respond method, but there is another method called hear we could have used. respond is triggered only when the message is addressed to the Hubot by name, so only messages that look like probot: accept, @probot decline, or / accept (if the Hubot name alias is enabled) will be processed by our Hubot. We could have used hear, but we are processing a simple response here, and without a clear recipient for the message it would be difficult to be sure we were interpreting it in the correct context. respond makes more sense here.
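To make the difference concrete, here is how the two registrations compare; the replies are only illustrative:

module.exports = (robot) ->
  # Fires only when the robot is addressed: "probot: accept" or "@probot accept"
  robot.respond /accept/i, ( res ) ->
    res.reply "Thanks, noting your acceptance."

  # Fires on any message containing the word, no matter who it was aimed at
  robot.hear /accept/i, ( res ) ->
    res.send "Someone said accept, but was that meant for me?"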
If they decline the request, let’s just graciously note that the offer was declined:
...
exports.decline = ( res ) ->
  res.reply "No problem, we'll go through this PR in a bug scrub"
...
We are asking someone to accept a pull request, and two requests could easily arrive within a very short period of time. For this reason, it makes sense to include the pull request identifier in the communication with the target user, and to tell users to reply with a string like accept 112. The Hubot can then interpret this to mean they are accepting PR #112 and not the other pull request the Hubot invited John to respond to 10 seconds later.
If we do this, our Hubot needs to save the state of pull request invitations. Fortunately, there is an extremely easy way to do this using the "brain" of our Hubot. The brain is a persistent store, typically backed by Redis, in which you can keep any type of information. You simply reference robot.brain and use methods like get and set to retrieve and store information. The set method takes any key and any value, but note that the Hubot brain does not do much with your value if it happens to be a complex object; if you want to properly serialize something beyond a flat value, you should call JSON.stringify on it to maintain full control over the round trip of storing and retrieving.
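Here is a sketch of how those two ideas fit together; the regex and the idea of keying the brain by PR number are ours, but they foreshadow the accept implementation later in this chapter. We store the pull request URL under its number when the webhook arrives, and capture the number from the chat reply:

# Inside prHandler, once we trust the webhook:
robot.brain.set "#{pr.number}", url

# In the extension script, pull the number out of "probot accept 112":
robot.respond /accept (\d+)/i, ( res ) ->
  prNumber = res.match[1]
  url = robot.brain.get prNumber
  # a flat string like the URL round-trips fine; richer values would
  # need JSON.stringify on the way in and JSON.parse on the way out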
Let’s modify our Hubot handler to deal with accepting or declining responses (and change our extension file to deal with this new interface). Of course, we will need to add to our tests. Finally, we will need to set up a way to provide the GitHub API key to our Hubot handler, so we’ll add a method to do that that looks almost exactly like the one for setting our secret key.
We’ll use a GitHub API NodeJS module called node-github, found on GitHub at https://github.com/mikedeboer/node-github. If we look at the API documentation, we see that it supports authentication using an OAuth token (via the github.authenticate( { type: "oauth", token: "..." } ) syntax) and has methods we can use to add a comment to an issue or pull request associated with a repository (the github.issues.createComment method).
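Outside of Hubot, that whole flow is only a few lines. This is a minimal sketch, assuming the package installs under the npm name github; the token and repository details are placeholders, and the field names follow the binding’s documentation:

GitHubApi = require 'github'
github = new GitHubApi version: "3.0.0"

# Authenticate with a personal OAuth token (placeholder value).
github.authenticate type: "oauth", token: "REPLACE_WITH_TOKEN"

# Comment on issue/PR #13 of a hypothetical repository.
msg = { user: "xrd", repo: "test_repository", number: 13,
  body: "Sketch: commenting on a pull request via node-github" }
github.issues.createComment msg, ( err, data ) ->
  console.log err or data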
Knowing that this module handles most of the work between these two methods, we can start by writing our tests. We’ll create a new describe block called #response that groups these tests together. As we noted earlier, our Hubot can take affirmative and negative responses, so our tests should reflect these two code paths. Our setup block (the beforeEach section) should do the same thing for each response: make the pull request invitation to a random user, which all happens inside our prHandler code. We don’t need to verify the expectations of this method again, since prior tests already cover it. After we get our handler into the right state, we need to test that the handler works correctly with an accept and a decline method (they don’t yet exist in our handler code, so we’ll add them next).
Our accept request handler triggers our Hubot to contact GitHub and add a comment to the pull request noting that our targeted chat user accepted the request. The network connection to the GitHub API uses the GitHub API bindings from the node-github module. We want to make this testable, so we pass the GitHub binding object in through our interface and, during the test, pass in a mocked object. If we review the documentation for createComment in the GitHub API binding, we see it requires information about the repository, such as the user or organization that owns the repository, the repository name, the issue number (pull requests are also referenced by issue numbers), and the comment itself. To get this information we simply need to decode it from the pull request information our Hubot handler receives, and we will add code that does this (exposed in our handler for testing). We saw that a pull request comes in through a large JSON response, and we can use the URL we extracted earlier as the source of this information. So we’ll need two more tests inside our #response block: one for decoding the URL into a message object, and another for retrieving the username we insert into the comment stored on the pull request. We know what our test URL looks like, since we saw it in our PR webhook message, but we don’t yet know the structure of the chat message from which we can pull out the username, so that test will need to be adjusted once we know what it really looks like.
Declining the request means nothing happens. If we mock out our GitHub API binding, acceptance should log in (using the authenticate method) and then call createComment; both calls come directly from the GitHub API NodeJS documentation. Finally, we should report the result of this operation back to the chat room, which happens via the reply method on our response object:
...
describe "#response", ->
createComment = jasmine.createSpy( 'createComment' ).and.
callFake( ( msg, cb ) -> cb( false, "some data" ) )
issues = { createComment: createComment }
authenticate = jasmine.createSpy( 'ghAuthenticate' )
responder = { reply: jasmine.createSpy( 'reply' ),
send: jasmine.createSpy( 'send' ) }
beforeEach ->
githubBinding = { authenticate: authenticate, \
issues: issues }
github = Handler.setApiToken( githubBinding, \
"ABCDEF" )
req = { body: '{ "pull_request" : \
{ url : "http://pr/1" } }', \
headers: { "HTTP_X_HUB_SIGNATURE" : \
"cd970490d83c01b678fa9af55f3c7854b5d22918" } }
Handler.prHandler( robot, req, responder )
it "should tag the PR on GitHub if the user accepts", (done) ->
Handler.accept( responder )
expect( authenticate ).toHaveBeenCalled()
expect( createComment ).toHaveBeenCalled()
expect( responder.reply ).toHaveBeenCalled()
done()
it "should not tag the PR on GitHub if the user declines", \
(done) ->
Handler.decline( responder )
expect( authenticate ).toHaveBeenCalled()
expect( createComment ).not.toHaveBeenCalledWith()
expect( responder.reply ).toHaveBeenCalled()
done()
it "should decode the URL into a proper message object " + \
"for the createMessage call", (done) ->
url = "https://github.com/xrd/testing_repository/pull/1"
msg = Handler.decodePullRequest( url )
expect( msg.user ).toEqual( "xrd" )
expect( msg.repository ).toEqual( "testing_repository" )
expect( msg.number ).toEqual( "1" )
done()
it "should get the username from the response object", (done) ->
res = { username: { name: "Chris Dawson" } }
expect( Handler.getUsernameFromResponse( res ) ).toEqual \
"Chris Dawson"
done()
Note that the indentation here was reduced to save space; your code will be nested several levels deeper. Refer to the sample repository for the exact code if there is confusion.
Our tests will fail if we run them now, so let’s write the code at the end of our delegator extension. We need code that parses the URL into the appropriate structured message object, code that puts the reminder into the pull request comment on GitHub, and code that pulls the user out of the response object passed to us. The first two are within reach; basic JavaScript and a reading of the GitHub API binding documentation will get us there. The third requires a little more investigation, so we will leave it as a placeholder for now.
To convert the URL into the object needed for the createComment call, we just need to split the URL on the slash character and then retrieve the correct items by index. We could add additional tests that cover passing in empty strings or other edge cases, but we’ll leave that as an exercise for the reader (one sample edge-case test appears right after the summary below). Our code does not crash in these cases, but it would be nice to have those expectations represented in our tests:
...
_GITHUB = undefined
_PR_URL = undefined

exports.decodePullRequest = (url) ->
  rv = {}
  if url
    chunks = url.split "/"
    if chunks.length == 7
      rv.user = chunks[3]
      rv.repository = chunks[4]
      rv.number = chunks[6]
  rv

exports.getUsernameFromResponse = ( res ) ->
  "username"

exports.accept = ( res ) ->
  msg = exports.decodePullRequest( _PR_URL )
  username = exports.getUsernameFromResponse( res )
  msg.body = "@#{username} will review this (via Probot)."
  _GITHUB.issues.createComment msg, ( err, data ) ->
    unless err
      res.reply "Thanks, I've noted that in a PR comment!"
    else
      res.reply "Something went wrong, " + \
        "I could not tag you on the PR comment."

exports.decline = ( res ) ->
  res.reply "OK, I'll find someone else."
  console.log "Declined!"

exports.setApiToken = (github, token) ->
  _API_TOKEN = token
  _GITHUB = github
  _GITHUB.authenticate type: "oauth", token: token

exports.setSecret = (secret) ->
  _SECRET = secret
To summarize, we added an internal variable called _GITHUB in which we store a reference to our instantiation of the GitHub API binding. The setApiToken call takes both our OAuth token and the binding itself; passing the binding in through the interface means we can hand it a mocked binding inside our tests. When we are not running inside a test, this method authenticates against the GitHub API, readying the binding to make connections to the GitHub API itself.
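As promised, here is one hypothetical edge-case test for decodePullRequest; it simply asserts that a malformed URL yields an empty object rather than an exception:

it "should return an empty object for a malformed URL", (done) ->
  msg = Handler.decodePullRequest( "" )
  expect( msg.user ).toBeUndefined()
  expect( msg.repository ).toBeUndefined()
  expect( msg.number ).toBeUndefined()
  done()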
Our top-level extension script looks like this now:
handler = require '../lib/handler'
handler.setSecret "XYZABC"

github = require 'github'
ginst = new github version: '3.0.0'
handler.setApiToken ginst, "12345ABCDEF"

module.exports = (robot) ->
  robot.respond /accept/i, ( res ) ->
    handler.accept( res )
  robot.respond /decline/i, ( res ) ->
    handler.decline( res )
  robot.router.post '/pr', ( req, res ) ->
    handler.prHandler( robot, req, res )
If you were to look only at this code, the interface is clean, and the bulk of the work is handled by our very testable handler.
We need to get the username, and it stands to reason that the object passed to our respond callback might contain it. The respond method provided by the Hubot API is documented mostly by way of the example scripts that come with Hubot; there is very little information on what the parameter passed to your callback looks like. Let’s use the util library to inspect the data and print it to the console. We abbreviate the full output here, and you can see it contains information on the user who sent the message to our Hubot. For example, we can retrieve the name of the user with response.message.user.name:
{ robot:
   { name: 'probot',
     brain:
      { data: [Object],
        ...
  message:
   { user:
      { id: '...',
        name: 'xrd',
        real_name: 'Chris Dawson',
        email: '...' },
     ...
     text: 'probot accept',
     rawText: 'accept',
     rawMessage:
      { _client: [Object],
        ...
  match: [ 'probot accept', index: 0, input: 'probot accept' ],
  ...
}
In all of that output we can find the information we need, specifically the username and email. So let’s update our test and our handler code. The last test in our spec file can be modified to look like this:
...
it "should get the username from the response object", (done) ->
res = { message: { user: { name: "Chris Dawson" } } }
expect( Handler.getUsernameFromResponse( res ) ).toEqual "Chris Dawson"
done()
...
And our handler code defining getUsernameFromResponse simply turns into this:
...
exports.getUsernameFromResponse = ( res ) ->
  res.message.user.name
...
With this information in hand, we can properly comment on the pull request. Well, almost.
If the Slack username for the person who accepted the pull request is an exact match with their GitHub username, then we can assume they are the same person in real life and create a comment inside the pull request reminding them (and anyone else) that they will be reviewing the PR. We can use the collaborator subsection of the Repository API to look up their name on GitHub.
If we don’t find them inside the list of users, and there is not an exact match with their Slack name, then we have at least one problem, maybe two. First, we could simply have a mismatch in their identities (their usernames are different on each site). If this is the case, we could ask them to clarify it inside the Slack room. The other case is that the user is not a collaborator on the repository hosted on GitHub; if so, clarifying their username is not going to help. The Repository API does support adding a user to the list of collaborators, so we could do that here, but this is arguably a moment where a larger discussion should happen (write access to a repository is a big responsibility in a way that being inside a chat room is not), and adding a repository collaborator should not be automated from a chat room. Because of the complexity here, we will write code to unify a username inside the chat room, but we won’t handle the case where there is no clarification to be made because they are not in the repository collaborator list.
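We will not build the full clarification flow here, but a handler for it could be as small as this sketch; the phrasing of the trigger and the brain key are invented for illustration:

robot.respond /my github username is (\S+)/i, ( res ) ->
  slackName = res.message.user.name
  githubName = res.match[1]
  # Remember the mapping so later lookups can translate Slack -> GitHub names.
  robot.brain.set "github-name-for-#{slackName}", githubName
  res.reply "Thanks, I'll use @#{githubName} when tagging you on GitHub."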
Using the GitHub API binding we passed into our setApiToken call, we will verify that the user exists as a collaborator on the repository. The binding provides a method called getCollaborator inside the repos namespace that we can use to verify that a username is on the list of collaborators. It takes as its first parameter a message that specifies the repository and owner, plus an attribute called collabuser, the name you want to check. The second parameter is a callback that is executed once the request has completed. If the callback returns without an error, our Hubot should tag the pull request with a comment confirming and message the room.
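Inside our handler, a call to that method looks roughly like this sketch; the repository details are placeholders, and "not a collaborator" is reported through the error argument of the callback:

msg = { user: "xrd", repo: "test_repository", collabuser: "xrd" }
_GITHUB.repos.getCollaborator msg, ( err, result ) ->
  if err
    console.log "#{msg.collabuser} is not a collaborator (or the call failed)"
  else
    console.log "#{msg.collabuser} is on the collaborator list"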
Our new test reflects usage of the repos.getCollaborator call. In our test setup block we mock out the call to getCollaborator and use Jasmine to "spy on" it so we can assert later that it was called. Our setup is beefier than before, but we follow the same patterns: generating spies to watch methods and implementing fake callbacks when necessary. We also move the message inside the response object into the one created in our setup block, so we can use it in all of our subtests rather than creating a new object inside each test body:
...
  send: jasmine.createSpy( 'send' ),
  message: { user: { name: "Chris Dawson" } } }
getCollaborator = jasmine.createSpy( 'getCollaborator' ).and.
  callFake( ( msg, cb ) -> cb( false, true ) )
repos = { getCollaborator: getCollaborator }
...
it "should tag the PR on GitHub if the user accepts", (done) ->
  Handler.accept( robot, responder )
  expect( authenticate ).toHaveBeenCalled()
  expect( createComment ).toHaveBeenCalled()
  expect( responder.reply ).toHaveBeenCalled()
  expect( repos.getCollaborator ).toHaveBeenCalled()
  done()
Our handler can then implement the accept and decline methods in full:
...
exports.accept = ( robot, res ) ->
  prNumber = res.match[1]
  url = robot.brain.get( prNumber )
  msg = exports.decodePullRequest( url )
  username = exports.getUsernameFromResponse( res )
  msg.collabuser = username
  _GITHUB.repos.getCollaborator msg, ( err, collaborator ) ->
    msg.body = "@#{username} will review this (via Probot)."
    _GITHUB.issues.createComment msg, ( err, data ) ->
      unless err
        res.reply "Thanks, I've noted that " + \
          "in a PR comment. " + \
          "Review the PR here: #{url}"
      else
        res.reply "Something went wrong. " + \
          "I could not tag you " + \
          "on the PR comment: " +
          "#{require('util').inspect( err )}"

exports.decline = ( res ) ->
  res.reply "No problem, we'll go through this PR in a bug scrub"
...
We now have a full implementation of both the accept and decline methods inside our Hubot.
It is typically bad form to save passwords (or other access credentials, like OAuth tokens or secrets) inside of source code. Right now we have hardcoded them into our application inside of the pr-delegator.coffee file. We could instead retrieve them from the environment of the running process:
...
handler.setSecret process.env.PROBOT_SECRET
github = require 'github'
ginst = new github version: '3.0.0'
handler.setApiToken ginst, process.env.PROBOT_API_TOKEN
...
When we launch our Hubot from the command line, we will need to use a command like this as we are testing locally from our laptop:
$ PROBOT_SECRET=XYZABC \
PROBOT_API_TOKEN=926a701550d4dfae93250dbdc068cce887531 \
HUBOT_SLACK_TOKEN=xoxb-3295776784-nZxl1H3nyLsVcgdD29r1PZCq \
./bin/hubot -a slack
When we publish into Heroku, we will want to set these as environment variables using the appropriate Heroku commands:
$ heroku config:set PROBOT_API_TOKEN=926a701550d4dfae93250dbdc068cce887531
Adding config vars and restarting myapp... done, v12
PROBOT_API_TOKEN=926a701550d4dfae93250dbdc068cce887531
$ heroku config:set PROBOT_SECRET=XYZABC
Adding config vars and restarting myapp... done, v12
PROBOT_SECRET=XYZABC
Don’t forget that when we run our tests, we will need to specify the environment variables on the command line as well:
$ PROBOT_SECRET=XYZABC \
PROBOT_API_TOKEN=926a701550d4dfae93250dbdc068cce887531 \
node_modules/jasmine-node/bin/jasmine-node --coffee \
spec/pr-delegator.spec.coffee
Our Hubot is alive! We went through building a robot that can interact with us inside a chat room, then refactored the robot so that its functionality is contained in a highly testable module. Along the way, we got intimate with the Hubot API, and even discussed how to modify the source code of Hubot itself (and the drawbacks of doing so). Finally, we demonstrated how to receive (and fake) the data a GitHub webhook sends us through the Activity API.
In the next chapter we will look at building a single-page application that edits information inside a GitHub repository using JavaScript and the GitHub.js library talking to the Pull Request API.