Skip to content
This repository has been archived by the owner on Feb 8, 2018. It is now read-only.

use an ENUM for is_suspicious #567

Closed
chadwhitacre opened this issue Jan 30, 2013 · 11 comments
Closed

use an ENUM for is_suspicious #567

chadwhitacre opened this issue Jan 30, 2013 · 11 comments

Comments

@chadwhitacre
Copy link
Contributor

We have three states for is_suspicious:

  • true - blacklisted
  • false - whitelisted
  • null - unreviewed

We should us an ENUM instead of a boolean for that (this was raised on an HN thread when we announced the Delpan Incident).


Want to back this issue? Place a bounty on it! We accept bounties via Bountysource.

@bruceadams
Copy link
Contributor

Using true and false as ENUM values will make the code confusing to read and modify. Are you suggesting the three ENUM values of blacklisted, whitelisted and unreviewed?

How about changing the variable name to credit_card_status with ENUM values of unknown, good, suspicious?

@joonas
Copy link
Contributor

joonas commented Feb 1, 2013

I'm 👍 for this, I think it would make some of the code simpler, and we should be able to get rid of some of the SQL in the ORM .filter methods.

As it happens, SQLAlchemy also has built-in support for using Enum column types :)

@chadwhitacre
Copy link
Contributor Author

How about changing the variable name to credit_card_status with ENUM values of unknown, good, suspicious?

I was thinking the property attached to the participant account overall, not just their credit card info. We also don't want to pay out to bank accounts attached to suspicious accounts. I think we keep it as is_suspicious for now, and maybe in the future it becomes more sophisticated, like a trust weighting or some such.

How about is_suspicious:

  • yes
  • no
  • maybe (default)

@bruceadams
Copy link
Contributor

This is related to issue #588. We need to be clear what the business rules are for avoiding fraud.

Here is a sketch of a gittip user's path to legitimacy.

suspicious

I'm assuming that we only care about user legitimacy for moving money into or out of Gittip. A user who has neither a credit card nor a bank account tied to Gittip can receive tips and re-gift those tips to others. No review of such a user is needed.

Once a user is setup to move money into or out of Gittip, we need to check if the user appears suspicious. Precisely speaking that means the user has outgoing tips pledged and a credit card on file or the user has incoming tips and a bank account connected to Gittip. I think we want to be proactive about reviewing accounts. We should review any account whenever it first adds a credit card or bank account, regardless of pending tip activity.

I see four states of legitimacy here:

  • limited - We don't yet care if the user is legitimate; cannot yet move money.
  • needs_review - The user is prepared to move money and needs to be reviewed.
  • legitimate - The user can move money.
  • suspicious - The user cannot receive or give tips at all; more restricted than limited.

@chadwhitacre
Copy link
Contributor Author

business rules

You and @joonas are in cahoots, aren't you? ;-)

user appeal

This is good to have on the table. We haven't really provided for this yet.

Another case:

  • Legitimate user has their account compromised and it's used with their credit card or other stolen cards. legit -> suspicious

@chadwhitacre
Copy link
Contributor Author

I see four states of legitimacy here:

I believe the first two states of the four you list are currently handled by checking whether the participant has a card/account attached (specifically, either last_bill_result or last_ach_result is not null). Are you suggesting it'd be better to have that state in the ENUM rather than querying for it?

@bruceadams
Copy link
Contributor

have that state in the ENUM rather than querying for it?

Good question! No. In general I prefer storing events and figuring out state at runtime. Most of the events I described are participant actions: joining Gittip, adding a credit card, adding a bank account and my proposed request appeal of being locked. I wanted to be clear (that business rules thing again 😄) about when we need to check the legitimacy of a participant.

The event currently stored in is_suspicious amounts to a Gittip administrator decision about the legitimacy of a participant. Payday processing should only move real money for participants that have been cleared. Payday processing (preprocessing?) should highlight participants that need to be reviewed. We don't want to leave a participant in limbo without our knowing about it.

I'd rather have our participant review stored more completely as an event. Who reviewed the participant (even if this is automated as some point, record it as such), when, what was the outcome (legit or suspicious) and maybe some free text commentary about the review.

legit -> suspicious

This is another event: our learning of a compromise for a participant. We'll need to lock the account. This has the potential to be messy. There are several different possible compromises. I doubt we want to guess in advance all possible compromises. Some examples:

  • A participant's credit card has been stolen.
  • A participant's Twitter account is compromised, allowing an attacker to switch the bank account for payouts.

Anyway, a compromise is also an event. I assume our reaction is the same: fully lock the participant account.

It's starting to sound like we should add a new database table to hold legitimacy decision events. Is that too heavy? I guess we could denormalize current state into each participant's is_suspicious column, if we need to.

@chadwhitacre
Copy link
Contributor Author

I prefer storing events and figuring out state at runtime.

This is great advice. I've been sort of stumbling towards this but hadn't cognized it so clearly. Thank you! :-)

I implemented tips this way more or less accidentally, but I've liked it so much that I've ticketed converting other pieces of data to this model (#141, #342, #343). My next question here then is whether you end up with a single events table (in the limit case) or whether the store is still ... "schemaful"? And then do you compute state using views? Stored procedures? App code? Some combination? How do you decide which and when?

@dmitshur
Copy link
Contributor

There's nothing wrong with caching some things, especially if the calculation is non-trivial, changes are infrequent, but it is accessed often. You must ensure the cache is generated and gets updated automatically; there can't be human effort required for maintenance (after the initial setup).

@chadwhitacre
Copy link
Contributor Author

Relevant: "Event Sourcing"

Capture all changes to an application state as a sequence of events.

We can query an application's state to find out the current state of the world, and this answers many questions. However there are times when we don't just want to see where we are, we also want to know how we got there.

Event Sourcing ensures that all changes to application state are stored as a sequence of events. Not just can we query these events, we can also use the event log to reconstruct past states, and as a foundation to automatically adjust the state to cope with retroactive changes.

http://www.martinfowler.com/eaaDev/EventSourcing.html

ht @seejee

chadwhitacre referenced this issue in joealcorn/www.gittip.com May 3, 2013
TODO: Actually store the email address and send email with confirmation link
@chadwhitacre
Copy link
Contributor Author

We've lived with is_suspicious as a nullable bool long enough, we don't need to change it to an enum. Evolving suspicious will mean going to a float instead of a bool anyway.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

5 participants