Quickstart

Install Arctic

pip install git+https://github.com/man-group/arctic.git

Run a MongoDB

mongod --dbpath <path/to/db_directory>

Using VersionStore

from arctic import Arctic
import quandl

# Connect to local MongoDB
store = Arctic('localhost')

# Create the library - defaults to VersionStore
store.initialize_library('NASDAQ')

# Access the library
library = store['NASDAQ']

# Load some data - maybe from Quandl
aapl = quandl.get("WIKI/AAPL", authtoken="your token here")

# Store the data in the library
library.write('AAPL', aapl, metadata={'source': 'Quandl'})

# Reading the data
item = library.read('AAPL')
aapl = item.data
metadata = item.metadata

VersionStore supports much more: See the HowTo!
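As one example of what else is possible, the sketch below collects every stored version of a symbol. It assumes `library` is a VersionStore library (such as `store['NASDAQ']` above), that `list_versions` returns dictionaries carrying a `'version'` key and a `'deleted'` flag, and that `read` accepts a version number through `as_of`. The helper name `read_every_version` is hypothetical and not part of Arctic.

```python
def read_every_version(library, symbol):
    """Return {version_number: data} for each non-deleted version of `symbol`.

    `library` is assumed to behave like an Arctic VersionStore library:
    it must offer list_versions(symbol) and read(symbol, as_of=...).
    """
    history = {}
    for entry in library.list_versions(symbol):
        # Assumption: deleted versions are flagged and cannot be read back.
        if entry.get('deleted'):
            continue
        version = entry['version']
        history[version] = library.read(symbol, as_of=version).data
    return history
```

With the quickstart library, `read_every_version(library, 'AAPL')` would return one DataFrame per write of `'AAPL'`, keyed by version number.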

Adding your own storage engine

Plugging a custom class in as a library type is straightforward. This example shows how.

Documentation

You can find complete documentation at Arctic docs

Concepts

Libraries

Arctic provides namespaced libraries of data. These libraries allow bucketing data by source, user or some other metric (for example frequency: End-Of-Day; Minute Bars; etc.).

Arctic supports multiple data libraries per user. A user (or namespace) maps to a MongoDB database (the granularity of mongo authentication). The library itself is composed of a number of collections within the database. Libraries look like:

  • user.EOD
  • user.ONEMINUTE

A library is mapped to a Python class. All library databases in MongoDB are prefixed with 'arctic_'.
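The naming convention described above can be sketched in a few lines. This is an illustration only, not Arctic's own code: the function name `split_library_name` is hypothetical, and the fallback database name `'arctic'` for a library without a namespace is an assumption.

```python
DB_PREFIX = 'arctic'


def split_library_name(library):
    """Split 'user.EOD' into (MongoDB database name, library name).

    Illustrative only; the un-namespaced fallback is an assumption.
    """
    parts = library.split('.', 1)
    if len(parts) == 2:
        namespace, name = parts
        # Namespaced library: the database is the prefixed namespace.
        return DB_PREFIX + '_' + namespace, name
    # Assumption: a library without a namespace lives in a default database.
    return DB_PREFIX, library


db_name, lib_name = split_library_name('user.EOD')
```

Here `db_name` is `'arctic_user'` and `lib_name` is `'EOD'`, so `user.EOD` and `user.ONEMINUTE` share one database and one set of MongoDB credentials.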

Storage Engines

Arctic includes three storage engines:

  • VersionStore: a key-value versioned TimeSeries store. It supports:
    • Pandas data types (other Python types pickled)
    • Multiple versions of each data item, with easy access to previous versions
    • Point-in-time snapshots across symbols in a library
    • Soft quota support
    • Hooks for persisting other data types
    • Audited writes: API for saving metadata and data before and after a write.
    • A wide range of TimeSeries data frequencies: End-Of-Day to Minute bars
    • See the HowTo
    • Documentation
  • TickStore: a column-oriented tick database. It supports dynamic fields; chunks aren't versioned. Designed for large, continuously ticking data.
  • ChunkStore: a storage type that allows data to be stored in customizable chunk sizes. Chunks aren't versioned, and can be appended to and updated in place.

Arctic storage implementations are pluggable. VersionStore is the default.
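To make the versioning and snapshot ideas concrete, here is a toy in-memory model. It is not Arctic code and stores nothing in MongoDB. The class `ToyVersionStore` and its methods are hypothetical and only mimic the semantics listed for VersionStore: every write creates a new version, and a snapshot pins the current version of each symbol.

```python
class ToyVersionStore:
    """In-memory illustration of versioned writes and snapshots."""

    def __init__(self):
        self._versions = {}   # symbol -> list of stored values, oldest first
        self._snapshots = {}  # snapshot name -> {symbol: version number}

    def write(self, symbol, data):
        """Store `data` as a new version and return its 1-based number."""
        history = self._versions.setdefault(symbol, [])
        history.append(data)
        return len(history)

    def snapshot(self, name):
        """Pin the latest version of every symbol under `name`."""
        self._snapshots[name] = {
            sym: len(hist) for sym, hist in self._versions.items()
        }

    def read(self, symbol, as_of=None):
        """Read the latest version, a version number, or a snapshot name."""
        history = self._versions[symbol]
        if as_of is None:
            version = len(history)
        elif isinstance(as_of, str):
            version = self._snapshots[as_of][symbol]
        else:
            version = as_of
        return history[version - 1]


toy = ToyVersionStore()
toy.write('AAPL', [1.0, 2.0])
toy.write('MSFT', [10.0])
toy.snapshot('end_of_day')
toy.write('AAPL', [1.0, 2.0, 3.0])

latest = toy.read('AAPL')                      # newest write
previous = toy.read('AAPL', as_of=1)           # explicit version number
pinned = toy.read('AAPL', as_of='end_of_day')  # version pinned by snapshot
```

The later write to `'AAPL'` does not change what the `'end_of_day'` snapshot returns, which is the property that makes point-in-time reads reproducible.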

Requirements

Arctic currently works with:

  • python 3.6, 3.7, 3.8
  • pymongo >= 3.6.0 <= 3.11.0
  • pandas >= 0.22.0 < 2
  • MongoDB >= 2.4.x <= 4.4.18

Operating Systems:

  • Linux
  • macOS
  • Windows 10

Acknowledgements

Arctic has been under active development at Man Group since 2012.

It wouldn't be possible without the work of the Man Data Engineering Team including:

Contributions welcome!

License

Arctic is licensed under the GNU LGPL v2.1, a copy of which is included in LICENSE.