Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Correlates client-side and server segment anonymous_id's #254

Merged

Conversation

discorick
Copy link
Member

Closes #253 #162

@dahlbyk dahlbyk deployed to huboard-rails-pr-254 March 28, 2016 18:40 Active
@discorick
Copy link
Member Author

debugger_-_segment

My only concern is the top step in the pipeline, the page() method doesn't seem to want to overwrite the anonymous_id, I am not sure how the GA integration will handle collapsing that and the properties hash.

Starting from the bottom highlighted payload:

{
  "anonymousId": "f9275b47-dbb6-4bba-b3f8-db5f871dbefc",
  "category": null,
  "context": {
    "page": {
      "path": "/login",
      "referrer": "http://huboard.rails/",
      "search": "",
      "title": "Huboard",
      "url": "http://huboard.rails/login"
    },
    "userAgent": "blahblahblah",
    "library": {
      "name": "analytics.js",
      "version": "2.11.1"
    },
    "ip": "68.146.230.217"
  },
  "integrations": {},
  "messageId": "ajs-bf209c1cc0373d8215d9fcc459a4fb23",
  "name": null,
  "properties": {
    "anonymousId": "5bc579b368d3ec787c1355a561d533ac",
    "path": "/login",
    "referrer": "http://huboard.rails/",
    "search": "",
    "title": "Huboard",
    "url": "http://huboard.rails/login"
  },
  "receivedAt": "2016-03-28T18:33:58.514Z",
  "sentAt": "2016-03-28T18:33:58.487Z",
  "timestamp": "2016-03-28T18:33:58.513Z",
  "type": "page",
  "userId": "1130665",
  "originalTimestamp": "2016-03-28T18:33:58.486Z"
}
{
  "userId": "Anonymous",
  "anonymousId": "5bc579b368d3ec787c1355a561d533ac",
  "name": "/login/private",
  "category": null,
  "properties": {
    "url": "/login/private"
  },
  "integrations": {},
  "context": {
    "library": {
      "name": "analytics-ruby",
      "version": "2.0.13"
    }
  },
  "timestamp": "2016-03-28T18:33:59.818Z",
  "type": "page",
  "messageId": "b30b5b83-5820-4012-8d75-c54eb9f7316d",
  "writeKey": "redacted",
  "sentAt": "2016-03-28T18:34:34.448Z",
  "receivedAt": "2016-03-28T18:34:34.851Z",
  "originalTimestamp": "2016-03-28T11:33:59.415-07:00"
}
{
  "userId": "1130665",
  "anonymousId": "9a3d0d56f08f65401416196a5dff89cc",
  "name": "/login/private/authorized",
  "category": null,
  "properties": {
    "url": "/login/private/authorized"
  },
  "integrations": {},
  "context": {
    "library": {
      "name": "analytics-ruby",
      "version": "2.0.13"
    }
  },
  "timestamp": "2016-03-28T18:34:07.068Z",
  "type": "page",
  "messageId": "0b5af0e8-1096-4b48-8fed-5e2aa8e3cf83",
  "writeKey": "redacted",
  "sentAt": "2016-03-28T18:34:34.723Z",
  "receivedAt": "2016-03-28T18:34:35.146Z",
  "originalTimestamp": "2016-03-28T11:34:06.645-07:00"
}

@@ -6,7 +6,8 @@
window.analytics.identify('<%= current_user.id %>', {})
window.analytics.page({ userId: '<% current_user.id %>' });
<% else %>
window.analytics.page();
window.analytics.identify({ anonymousId: '<%= session.id %>' });
window.analytics.page({ anonymousId: '<%= session.id %>' });
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also set anonymousId when logged_in??

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think so.

@discorick
Copy link
Member Author

So, oddly enough Warden sees fit to instruct rack to reset the session id after every login. I have tried getting around this using callbacks:

Warden::Manager.after_set_user do |user,auth,opts|
  auth.env["rack.session.options"][:renew] = false
end

That didn't work, switching after_set_user with on_request callback didn't help either, both happen after the rack session id has been renewed.

@discorick discorick deployed to huboard-rails-pr-254 March 28, 2016 21:33 Active
if opts[:store] != false && opts[:event] != :fetch
options = env[ENV_SESSION_OPTIONS]
session_serializer.store(user, scope)
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For our future reference, set_user is copy/pasta from https://github.com/hassox/warden/blob/cb09b959bcac82a087ff0f2938273e6924d845d8/lib/warden/proxy.rb#L172-L176 except:

   if opts[:store] != false && opts[:event] != :fetch
     options = env[ENV_SESSION_OPTIONS]
-    options[:renew] = true if options
     session_serializer.store(user, scope)
   end

I know we're considering a switch to OmniAuth (#157), but I wonder if we should submit a patch to Warden along the lines of:

```diff
   if opts[:store] != false && opts[:event] != :fetch
     options = env[ENV_SESSION_OPTIONS]
-    options[:renew] = true if options
+    options[:renew] = true if options && !opts[:renew]
     session_serializer.store(user, scope)
   end

@dahlbyk
Copy link
Member

dahlbyk commented Mar 29, 2016

So the session reset seems to be attack mitigation technique, which we probably don't want to lose just in the name of analytics.

Two options:

  • Read Segment's ajs_anonymous_id cookie from each request and pass it down through.
  • Add something server-side to inject our own correlation ID into session when one is created, and ensure that value is preserved through login.

The latter seems more durable in the presence of ad blockers.

…tion attacks, sets a guid per session as anon id via rack middleware
@discorick discorick deployed to huboard-rails-pr-254 March 30, 2016 18:25 Active
@discorick
Copy link
Member Author

@dahlbyk , I opted for option number 2 - your right, session reset was designed to prevent session fixation attacks.

@dahlbyk dahlbyk deployed to huboard-rails-pr-254 March 31, 2016 06:16 Active
@dahlbyk dahlbyk deployed to huboard-rails-pr-254 March 31, 2016 06:28 Active
@dahlbyk
Copy link
Member

dahlbyk commented Mar 31, 2016

Also, while we're here: I noticed that we're still including the previous userId in payloads even after Logout (between pages /repositories/private/dahlbyk and / at the top):

firefox_2016-03-31_09-24-01

Speaking of which, we should probably track /logout server-side before actually logging out.

@dahlbyk
Copy link
Member

dahlbyk commented Mar 31, 2016

Try yet another approach to set anonymousId client-side

This seems to be working as expected. My only doubt now is that I appear to have seen my guid change when clicking from my initial dashboard (via /login/public) to the Private tab. Not sure if there's something weird happening there with the bump in auth scope, or if I'm just seeing things.

@discorick
Copy link
Member Author

My only doubt now is that I appear to have seen my guid change when clicking from my initial dashboard (via /login/public) to the Private tab

You are correct, when we upgrade auth scope this is happening:

request.env['warden'].logout if github_authenticated? :default
effectively clearing the session entirely (see app/controllers/dashboard_controller.rb LN 47)

The question is should we fix this now, or as part of our improved login flow?

@dahlbyk
Copy link
Member

dahlbyk commented Mar 31, 2016

The question is should we fix this now, or as part of our improved login flow?

What happens if we just delete that line (and these)? AFAICT, the logout only seems to be a problem from an Analytics session cohesion standpoint.

…non_id' of github.com:huboard/huboard-web into discorick/correlates_client_and_server_analytics_with_anon_id
@discorick discorick deployed to huboard-rails-pr-254 March 31, 2016 20:33 Active
@discorick
Copy link
Member Author

After removing the explicit logouts on scope change everything seems to still function as it should, and our anon guid remains consistent across the board.

@rauhryan , I remember way back when we moved to rails there was a reason for the logouts - it seems thats no longer the case (see 264d998 )?

@dahlbyk
Copy link
Member

dahlbyk commented Mar 31, 2016

While we wait for @rauhryan to refresh his memory, can you check on resetting userId on Logout:

I noticed that we're still including the previous userId in payloads even after Logout (between pages /repositories/private/dahlbyk and / at the top)

I would expect the session to be reset altogether (anonymousId, too).

@discorick
Copy link
Member Author

It looks like Segment is caching the traits client-side, so even though the session is lost analytics.js can still tell who the user is (this includes all the keys in the traits object)

I don't see any inherent risk here as we are only storing publicly available GitHub data.

@dahlbyk
Copy link
Member

dahlbyk commented Apr 1, 2016

I don't see any inherent risk here as we are only storing publicly available GitHub data.

It's just a correctness thing. Analytics.js does provide reset() that we could inject client-side. I'd rather ship this than worry too much about tracking after logout.

@dahlbyk
Copy link
Member

dahlbyk commented Apr 4, 2016

After removing the explicit logouts on scope change everything seems to still function as it should, and our anon guid remains consistent across the board.

@rauhryan , I remember way back when we moved to rails there was a reason for the logouts - it seems thats no longer the case (see 264d998 )?

I've kicked the tires quite a bit on this...not seeing anything unexpected through various login/logout cycles starting from public and private and then upgrading or downgrading. Merging.

@dahlbyk dahlbyk merged commit a9d3225 into master Apr 4, 2016
@dahlbyk dahlbyk deleted the discorick/correlates_client_and_server_analytics_with_anon_id branch April 4, 2016 22:06
@rauhryan
Copy link
Member

@discorick @dahlbyk

I remember the reason for adding the explicit logout was to force an OAuth change request to add the ability to "downgrade" HuBoard's access

But it sounds like GitHub has changed the default behavior of their OAuth flow

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants