Showing 8 changed files with 223 additions and 35 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,57 @@ | ||
# About | ||
|
||
Tribe is a utility that extracts a network (a graph) from a communication network that we all use often: our email. Tribe reads an email mbox (a common mailbox format that Python's standard `mailbox` module can read) and writes the resulting graph to a GraphML file on disk. This utility is generally used for District Data Labs' Graph Analytics with Python and NetworkX course, but can be used by anyone interested in studying networks. | ||
|
||
## Contributing | ||
|
||
Tribe is open source, and I'd love your help. If you would like to contribute, you can do so in the following ways: | ||
|
||
1. Add issues or bugs to the bug tracker: [https://github.com/DistrictDataLabs/tribe/issues](https://github.com/DistrictDataLabs/tribe/issues) | ||
2. Work on a card on the dev board: [https://waffle.io/DistrictDataLabs/tribe](https://waffle.io/DistrictDataLabs/tribe) | ||
3. Create a pull request in GitHub: [https://github.com/DistrictDataLabs/tribe/pulls](https://github.com/DistrictDataLabs/tribe/pulls) | ||
|
||
Note that the labels in the GitHub issues are defined in the blog post: [How we use labels on GitHub Issues at Mediocre Laboratories](https://mediocre.com/forum/topics/how-we-use-labels-on-github-issues-at-mediocre-laboratories). | ||
|
||
If you are a member of the District Data Labs Faculty group, you have direct access to the repository, which is set up in a typical production/release/development cycle as described in _[A Successful Git Branching Model](http://nvie.com/posts/a-successful-git-branching-model/)_. A typical workflow is as follows: | ||
|
||
1. Select a card from the [dev board](https://waffle.io/DistrictDataLabs/tribe), preferably one that is "ready", then move it to "in-progress". | ||
|
||
2. Create a branch off of develop called "feature-[feature name]", then work and commit in that branch. | ||
|
||
~$ git checkout -b feature-myfeature develop | ||
|
||
3. Once you are done working (and everything is tested), merge your feature into develop. | ||
|
||
~$ git checkout develop | ||
~$ git merge --no-ff feature-myfeature | ||
~$ git branch -d feature-myfeature | ||
~$ git push origin develop | ||
|
||
4. Repeat. Releases will be routinely pushed into master via release branches, then deployed to the server. | ||
|
||
## Contributors | ||
|
||
Thank you for all your help contributing to make Tribe a great project! | ||
|
||
### Maintainers | ||
|
||
- Benjamin Bengfort: [@bbengfort](https://github.com/bbengfort/) | ||
|
||
### Contributors | ||
|
||
- Your name welcome here! | ||
|
||
## Changelog | ||
|
||
The release versions that are sent to the Python Package Index (PyPI) are also tagged in GitHub. You can see the tags through the GitHub web application and download the tarball of the version you'd like. | ||
|
||
The versioning uses a three-part system, "a.b.c". "a" represents a major release that may not be backwards compatible. "b" is incremented on minor releases that may contain extra features but are backwards compatible. "c" releases are bug fixes or other micro changes that developers should feel free to update to immediately. | ||
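
The "a.b.c" rule above can be illustrated with a short Python sketch. The helper names `parse_version` and `release_kind` are hypothetical and are not part of Tribe; the sketch only assumes that versions are three dot-separated integers, optionally prefixed with "v" as in the tags.

```python
def parse_version(text):
    """Split an "a.b.c" string (optionally prefixed with "v") into three integers."""
    major, minor, micro = (int(part) for part in text.lstrip("v").split("."))
    return major, minor, micro


def release_kind(old, new):
    """Name the kind of release that moves from version `old` to version `new`."""
    old_parts, new_parts = parse_version(old), parse_version(new)
    if new_parts <= old_parts:
        raise ValueError(f"{new} is not newer than {old}")
    if new_parts[0] != old_parts[0]:
        return "major"  # "a" changed: may not be backwards compatible
    if new_parts[1] != old_parts[1]:
        return "minor"  # "b" changed: extra features, backwards compatible
    return "micro"      # "c" changed: bug fixes or other micro changes


print(release_kind("1.1.2", "1.1.3"))
print(release_kind("1.1.2", "1.2.0"))
print(release_kind("v1.1.2", "2.0.0"))
```

A check like this could help decide whether an upgrade is safe to apply without reviewing the changelog first.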
|
||
### Version 1.1.2 | ||
|
||
* **tag**: [v1.1.2](https://github.com/DistrictDataLabs/tribe/releases/tag/v1.1.2) | ||
* **release**: Thursday, November 20, 2014 | ||
* **deployment**: Friday, March 11, 2016 | ||
* **commit**: [69fe3c6](https://github.com/DistrictDataLabs/tribe/commit/69fe3c69130899479be2e33f73872d6cfedd4659) | ||
|
||
This is the initial release of Tribe that has been used for teaching since the first SNA workshop in 2014. This version was cleaned up a bit, with unneeded dependencies removed and better organization. This is also the first version that was deployed to PyPI. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,46 @@ | ||
# Exporting an MBox from Email | ||
|
||
Your email is an easy place to obtain a communications network for graph analysis. Tribe extracts the relationships between unique email addresses by exploring who is connected by participating in the same email message. In particular, we will use a common format for email storage called `mbox`. If you have Apple Mail, Thunderbird, or Microsoft Outlook, you should be able to export your mbox. If you have [Gmail](https://gmail.com), you may have to use an online email extraction tool. | ||
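
To make the idea of "connected by participating in the same email" concrete, here is a minimal sketch that uses only Python's standard library. It is not Tribe's implementation: the function names are hypothetical, and treating the From, To, and Cc headers as the participants of a message is an assumption made for illustration.

```python
import mailbox
import os
import tempfile
from collections import Counter
from email.utils import getaddresses
from itertools import combinations


def participants(message):
    """Return the set of lowercased addresses in the From, To, and Cc headers."""
    headers = []
    for name in ("From", "To", "Cc"):  # assumed choice of headers
        headers.extend(str(value) for value in message.get_all(name, []))
    return {addr.lower() for _, addr in getaddresses(headers) if addr}


def extract_edges(mbox_path):
    """Count how often each pair of addresses appears on the same message."""
    edges = Counter()
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for message in box:
            for pair in combinations(sorted(participants(message)), 2):
                edges[pair] += 1
    finally:
        box.close()
    return edges


# A tiny two-message mbox, written to a temporary file for demonstration.
SAMPLE = (
    "From alice@example.com Thu Nov 20 10:00:00 2014\n"
    "From: Alice <alice@example.com>\n"
    "To: bob@example.com, Carol <carol@example.com>\n"
    "Subject: hello\n"
    "\n"
    "First message.\n"
    "\n"
    "From bob@example.com Thu Nov 20 11:00:00 2014\n"
    "From: bob@example.com\n"
    "To: alice@example.com\n"
    "Subject: re: hello\n"
    "\n"
    "Second message.\n"
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.mbox")
    with open(path, "w") as handle:
        handle.write(SAMPLE)
    edges = extract_edges(path)

for (first, second), weight in sorted(edges.items()):
    print(first, second, weight)
```

Each key in `edges` is an unordered pair of addresses and each value is the number of messages the pair shared, which is one plausible edge weight for the resulting graph.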
|
||
## Gmail or Google Apps | ||
|
||
**Note: if you're taking the DDL Workshop, make sure you do this in advance of the class. It can take hours or even days for the archive to be created!** | ||
|
||
1. Go to [https://takeout.google.com/settings/takeout](https://takeout.google.com/settings/takeout). | ||
2. In the "select data to include" box, make sure Mail is turned on and everything else is turned off, then click Next. | ||
3. Select your compression format (zip for Windows, tgz for Mac) and click Create Archive. | ||
4. Once the archive has been created, you will receive an email notification. | ||
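
Once the archive has been downloaded, the mbox still has to be located inside it. The following sketch lists archive members whose names end in `.mbox`, for either compression format. The function name is hypothetical, and the member paths in the demonstration are made up; the actual layout of a downloaded archive is not described here and may differ.

```python
import os
import tarfile
import tempfile
import zipfile


def find_mbox_members(archive_path):
    """List the names of archive members whose filename ends in .mbox."""
    if zipfile.is_zipfile(archive_path):
        with zipfile.ZipFile(archive_path) as archive:
            names = archive.namelist()
    elif tarfile.is_tarfile(archive_path):
        with tarfile.open(archive_path) as archive:
            names = archive.getnames()
    else:
        raise ValueError(f"{archive_path} is not a zip or tar archive")
    return [name for name in names if name.lower().endswith(".mbox")]


# Demonstration with a small zip whose member paths are purely hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    archive_path = os.path.join(tmp, "export.zip")
    with zipfile.ZipFile(archive_path, "w") as archive:
        archive.writestr("export/readme.txt", "not mail")
        archive.writestr("export/mail/sample.mbox", "From a@example.com\n\n")
    found = find_mbox_members(archive_path)

print(found)
```

The returned names can then be passed to the archive's own extraction method to write the mbox to disk.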
|
||
## Outlook | ||
1. Select the messages you would like to export, or the folder, if you would like to export the entire folder. | ||
2. Click the MessageSave Outlook toolbar button. | ||
3. Select "include subfolders" if you would like to export subfolders of the current folder as well. | ||
4. Select "MBOX" in the "Format" field. Click "Save Now". | ||
5. That's it. You should see mbox file(s) created in the destination directory. | ||
6. MessageSave creates one file per Outlook folder processed. | ||
|
||
## Thunderbird | ||
|
||
1. Go to the [Import/Export Tools website](https://addons.mozilla.org/en-US/thunderbird/addon/importexporttools/). | ||
2. Right-click on the download link and select "Save Target/Link As." | ||
3. Save the ".xpi" file to your computer's hard disk and note the location. | ||
4. Start up Thunderbird and select "Add-ons" from the "Tools" menu. | ||
5. Click "Extensions" in the new window and click "Install." | ||
6. Browse to your saved ImportExport Tools ".xpi" file and click "Open." | ||
7. Click the "Install Now" button and close Thunderbird. | ||
8. Restart Thunderbird and select "ImportExport Tools" from the "Tools" menu. Click "Options." | ||
9. Select the "Export Directories" tab. Check the box next to "Export folders as MBOX file." | ||
10. Browse to the drive and folder to which you want to export your mbox files. Click "OK" twice. | ||
11. Select "ImportExport Tools" from the "Tools" menu again. Click on "Export all the folders." | ||
12. Choose a folder from Thunderbird's collective "Profiles" folder and its contents will be exported as mbox files. | ||
|
||
## Apple Mail | ||
|
||
1. Select one or more mailboxes to export. | ||
|
||
To select mailboxes that are next to each other (contiguous) in the list, hold down Shift as you click the first and last mailbox. To select mailboxes that are not next to each other in the list, hold down Command as you click each mailbox. | ||
|
||
2. Choose Mailbox > Export Mailbox, or choose Export Mailbox from the Action pop-up menu (looks like a gear) at the bottom of the sidebar. | ||
3. Choose a folder or create a new folder where you want to store the exported mailbox, and then click Choose. | ||
|
||
Mail exports the mailboxes as .mbox packages. If you previously exported a mailbox, Mail does not overwrite the existing .mbox file but appends a number to the filename of the new export to create a new version, such as My Mailbox 3.mbox. |
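
Because an exported `.mbox` package is a folder rather than a single file, a tool that expects a plain mbox file may need the path of the file inside the package. The sketch below assumes the package contains a file named `mbox`; that layout is an assumption to verify against your own export, and `resolve_mbox_file` is a hypothetical helper, not part of Tribe.

```python
import os
import tempfile


def resolve_mbox_file(path):
    """Return the mbox file for either a plain file or a package directory."""
    if os.path.isfile(path):
        return path
    candidate = os.path.join(path, "mbox")  # assumed layout of an exported package
    if os.path.isfile(candidate):
        return candidate
    raise FileNotFoundError(f"no mbox file found at {path}")


# Demonstration with a stand-in package directory created on the fly.
with tempfile.TemporaryDirectory() as tmp:
    package = os.path.join(tmp, "My Mailbox.mbox")
    os.mkdir(package)
    with open(os.path.join(package, "mbox"), "w") as handle:
        handle.write("From a@example.com\n\n")
    resolved = resolve_mbox_file(package)
    relative = os.path.relpath(resolved, tmp)

print(relative)
```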
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,25 @@ | ||
# Tribe Documentation | ||
|
||
Documentation to follow soon! | ||
Social networks are not new, even though websites like Facebook and Twitter might make you want to believe they are (and trust me, I'm not talking about Myspace!). Social networks are extremely interesting models for human behavior, and their study dates back to the early twentieth century. However, because of those websites, data scientists have access to much more data than the anthropologists who studied the networks of tribes! | ||
|
||
Because networks take a relationship-centered view of the world, the data structures that we will analyze model real-world behaviors and communities. Through a suite of algorithms derived from mathematical graph theory, we are able to compute and predict the behavior of individuals and communities. This has a number of practical applications, from recommendation to law enforcement to election prediction, and more. | ||
|
||
Tribe is a utility that extracts a network (a graph) from a communication network that we all use often: our email. Tribe reads an email mbox (a common mailbox format that Python's standard `mailbox` module can read) and writes the resulting graph to a GraphML file on disk. This utility is generally used for District Data Labs' Graph Analytics with Python and NetworkX course, but can be used by anyone interested in studying networks. | ||
|
||
## Quick Start | ||
|
||
1. Download your data. See [Exporting an MBox from Email](emails.md) for more information on how to accomplish this. | ||
|
||
2. Install the tribe utility with `pip`: | ||
|
||
$ pip install tribe | ||
|
||
3. If you would like to develop for tribe, please see the instructions in the README. | ||
|
||
4. Extract a graph from your email MBox as follows: | ||
|
||
$ tribe-admin.py extract -w myemails.graphml myemails.mbox | ||
|
||
5. Be patient; this could take some time. On my MacBook Pro it took 12 minutes to perform the complete extraction on an MBox that was 7.5 GB. | ||
|
||
You're now ready to get started analyzing your email network! |
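
As a first look at an extracted graph, the sketch below counts how many edges touch each node in a GraphML document, using only Python's standard library. The sample document is hand-written for illustration; the file that Tribe writes may carry additional attributes, and `degree_counts` is a hypothetical helper. For a file on disk, replace `ET.fromstring(SAMPLE)` with `ET.parse(path).getroot()`.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# The standard GraphML XML namespace, bound to the prefix "g" for searches.
GRAPHML_NS = {"g": "http://graphml.graphdrawing.org/xmlns"}

# A small hand-written GraphML document used only for this demonstration.
SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <graph id="G" edgedefault="undirected">
    <node id="alice@example.com"/>
    <node id="bob@example.com"/>
    <node id="carol@example.com"/>
    <edge source="alice@example.com" target="bob@example.com"/>
    <edge source="alice@example.com" target="carol@example.com"/>
  </graph>
</graphml>"""


def degree_counts(root):
    """Count how many edges touch each node id in a parsed GraphML tree."""
    degree = Counter(
        {node.get("id"): 0 for node in root.iterfind(".//g:node", GRAPHML_NS)}
    )
    for edge in root.iterfind(".//g:edge", GRAPHML_NS):
        degree[edge.get("source")] += 1
        degree[edge.get("target")] += 1
    return degree


root = ET.fromstring(SAMPLE)
degree = degree_counts(root)
for address, count in degree.most_common():
    print(address, count)
```

For real analysis, a graph library such as NetworkX (used in the course mentioned above) is the more natural tool; this sketch only shows that the output is plain XML that can be inspected directly.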
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1,12 @@ | ||
site_name: My Docs | ||
site_name: Tribe | ||
repo_name: GitHub | ||
repo_url: https://github.com/DistrictDataLabs/tribe | ||
site_description: Tribe extracts a network from an email mbox and writes it to a graphml file for visualization and analysis. | ||
site_author: District Data Labs | ||
copyright: Built by District Data Labs, licensed by <a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-nc-sa/4.0/80x15.png" /></a> | ||
theme: readthedocs | ||
|
||
pages: | ||
- "Introduction": index.md | ||
- "Acquiring an Email MBox": emails.md | ||
- "About Tribe": about.md |