diff --git a/README.rst b/README.rst index 3d79018..739ab47 100644 --- a/README.rst +++ b/README.rst @@ -113,23 +113,24 @@ The contains: To use the Memory back-end plug, include the following in the : -.. code-block:: json +.. code-block:: text { "backend": { "module_class": "MemoryBackend", - "filename": "" + "filename": } } To use the Mongo DB back-end plug, include the following in the : -.. code-block:: json +.. code-block:: text { "backend": { "module_class": "MongoBackend", - "uri": " # e.g., 'mongodb://localhost:27017/'" + "uri": # e.g., 'mongodb://localhost:27017/' + "filename": } } @@ -144,7 +145,7 @@ the in plain text. Here is an example: -.. code-block:: json +.. code-block:: text { "users": { @@ -161,12 +162,12 @@ Authorization could be enhanced by changing the method "decorated" using Configs may also contain a "taxii" section as well, as shown below: -.. code-block:: json +.. code-block:: text { "taxii": { "max_page_size": 100 - "interop_requirements": true + "interop_requirements": True/False # the TAXII interop document has some additional requirements } } @@ -181,23 +182,22 @@ We welcome contributions for other back-end plugins. Docker ------ -We also provide a Docker image to make it easier to run *medallion* +We also provide a Docker image to make it easier to run *medallion* with the MongoDB backend. Use the --build argument +if the code has changed. .. code-block:: bash - $ docker build . -t medallion -f docker_utils/Dockerfile + $ docker-compose up [--build] -If operating behind a proxy, add the following option (replacing `` with -your proxy location and port): ``--build-arg https_proxy=``. +This uses the information in docker-compose.yml to create a Docker container with medallion, mongo db and mongo-express -Then run the image +If operating behind a proxy, add the following to the medallion:build section of docker-compose.yml: -.. code-block:: bash +.. 
code-block:: text - $ docker run --rm -p 5000:5000 -v :/var/taxii medallion + HTTPS_PROXY: -Replace ```` with the full path to the directory containing your -medallion configuration. +replacing with your proxy location and port. Governance ---------- diff --git a/docs/mongodb_schema.rst b/docs/mongodb_schema.rst index ccc0aeb..bb38927 100644 --- a/docs/mongodb_schema.rst +++ b/docs/mongodb_schema.rst @@ -8,9 +8,22 @@ Each Mongo database contains one or more collections. The term "collection" in It is unfortunate that the term "collection" is also used to signify something unrelated in the TAXII specification. We will use the phrase "taxii collection" to distinguish them. -An instance of this schema can be populated via the file test/data/initialize_mongodb.py. This instance will be used for examples below. +You can initialize the database with content by specifying a JSON file in the backend section of the medallion configuration. -Utilities to initialize your own Mongo DB can be found in test/generic_initialize_mongodb.py. +To initialize the database for testing, use medallion/test/data/default_data.json. Use the format of that file to determine how +to initialize with your own data. + +For example: + +.. code-block:: text + + { + "backend": { + "module_class": "MongoBackend", + "uri": # e.g., 'mongodb://localhost:27017/' + "filename": + } + } The discovery database ---------------------- @@ -55,7 +68,11 @@ Here is a document from the example database: The api root databases ---------------------- -Each api root is contained in a separate Mongo DB database. It has four collections: **status**, **objects**, **manifests**, and **collections**. To support multiple taxii collections, any document in the **objects** and **manifests** contains an extra property, "collection_id", to link it to the taxii collection that it is contained in. 
Because "_collection_id" property is not part of the TAXII specification, it will be stripped by *medallion* before any document is returned to the client. +Each api root is contained in a separate Mongo DB database. It has three collections: **status**, **objects**, +and **collections**. To support multiple TAXII collections, any document in the **objects** collection contains an extra +property, "_collection_id", to link it to the taxii collection that it is contained in. +Because the "_collection_id" property is not part of the TAXII specification, it will be stripped by *medallion* +before any document is returned to the client. A document from the **collections** collection: @@ -72,23 +89,32 @@ A document from the **collections** collection: ] } +Because the STIX objects and the manifest entries share so much information, the manifest is stored with the object. + A document from the **objects** collection: .. code-block:: json { - "created": "2014-05-08T09:00:00.000Z", - "id": "indicator--a932fcc6-e032-176c-126f-cb970a5a1ade", - "labels": [ - "file-hash-watchlist" - ], - "modified": "2014-05-08T09:00:00.000Z", - "name": "File hash for Poison Ivy variant", - "pattern": "[file:hashes.'SHA-256' = 'ef537f25c895bfa782526529a9b63d97aa631564d5d789c2b765448c8635fb6c']", - "type": "indicator", - "valid_from": "2014-05-08T09:00:00.000000Z", - "_collection_id": "91a7b528-80eb-42ed-a74d-c6fbd5a26116" - } + "created": "2017-01-27T13:49:53.997Z", + "description": "Poison Ivy", + "id": "malware--c0931cc6-c75e-47e5-9036-78fabc95d4ec", + "is_family": true, + "malware_types": [ + "remote-access-trojan" + ], + "modified": "2017-01-27T13:49:53.997Z", + "name": "Poison Ivy", + "spec_version": "2.1", + "type": "malware", + "_collection_id": "91a7b528-80eb-42ed-a74d-c6fbd5a26116", + "_manifest": { + "date_added": "2017-01-27T13:49:59.997000Z", + "id": "malware--c0931cc6-c75e-47e5-9036-78fabc95d4ec", + "media_type": "application/stix+json;version=2.1", + "version": "2017-01-27T13:49:53.997Z" + } + } 
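Review note: the stored document above carries both the STIX object and its manifest entry. One way to picture how the two could be separated before a response is built is the sketch below. The helper name ``split_stored_document`` is hypothetical and is not taken from medallion; the actual implementation may differ. It only assumes that storage-only properties start with an underscore, as "_collection_id" and "_manifest" do in the example.

```python
def split_stored_document(doc):
    """Return (stix_object, manifest_entry) from one stored document.

    Hypothetical helper for illustration only; not medallion's API.
    """
    # Properties that start with "_" are storage bookkeeping, not STIX data.
    stix_object = {k: v for k, v in doc.items() if not k.startswith("_")}
    # Copy the embedded manifest so callers cannot mutate the stored document.
    manifest_entry = dict(doc.get("_manifest", {}))
    return stix_object, manifest_entry


stored = {
    "id": "malware--c0931cc6-c75e-47e5-9036-78fabc95d4ec",
    "type": "malware",
    "modified": "2017-01-27T13:49:53.997Z",
    "_collection_id": "91a7b528-80eb-42ed-a74d-c6fbd5a26116",
    "_manifest": {
        "date_added": "2017-01-27T13:49:59.997000Z",
        "id": "malware--c0931cc6-c75e-47e5-9036-78fabc95d4ec",
        "media_type": "application/stix+json;version=2.1",
        "version": "2017-01-27T13:49:53.997Z",
    },
}

stix_object, manifest_entry = split_stored_document(stored)
print(sorted(stix_object))
print(manifest_entry["version"])
```

Under this assumption an objects request would return ``stix_object`` and a manifest request would return ``manifest_entry``, so neither response exposes the underscore-prefixed properties.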
A document from the **status** collection: @@ -117,18 +143,5 @@ A document from the **status** collection: ] } -A document from the **manifest** collection: -.. code-block:: json - { - "id": "indicator--a932fcc6-e032-176c-126f-cb970a5a1ade", - "date_added": "2016-11-01T10:29:05Z", - "versions": [ - "2014-05-08T09:00:00.000Z" - ], - "media_types": [ - "application/vnd.oasis.stix+json; version=2.0" - ], - "_collection_id": "91a7b528-80eb-42ed-a74d-c6fbd5a26116" - } diff --git a/medallion/common.py b/medallion/common.py index 364bf6d..ff1f2bf 100644 --- a/medallion/common.py +++ b/medallion/common.py @@ -233,14 +233,17 @@ def find_att(obj): string value of the field from the object to use for versioning """ - if "version" in obj: - return string_to_datetime(obj["version"]) - elif "modified" in obj: + # check for STIX object properties first, then for version in the manifest + if "modified" in obj: return string_to_datetime(obj["modified"]) elif "created" in obj: return string_to_datetime(obj["created"]) - else: + elif "_date_added" in obj: return string_to_datetime(obj["_date_added"]) + elif "version" in obj: + return string_to_datetime(obj["version"]) + else: + raise ValueError("Unable to determine the version attribute of {}".format(obj)) def find_version_attribute(obj): diff --git a/medallion/test/data/default_data.json b/medallion/test/data/default_data.json index 926c044..9aa88f7 100644 --- a/medallion/test/data/default_data.json +++ b/medallion/test/data/default_data.json @@ -124,6 +124,25 @@ "application/stix+json;version=2.1" ], "objects": [ + { + "type": "malware-analysis", + "spec_version": "2.1", + "id": "malware-analysis--084a658c-a7ef-4581-a21d-1f600908741b", + "created_by_ref": "identity--eae683c1-d472-4708-bd63-f9b1a1f016b1", + "created": "2021-04-16T09:49:24.378932Z", + "modified": "2021-12-11T07:17:44.542582Z", + "product": "option", + "version": "moment", + "submitted": "2022-03-26T15:06:01.434493Z", + "analysis_started": 
"2022-10-04T07:07:55.365672Z", + "analysis_ended": "2023-06-14T07:12:00.962419Z", + "result": "unknown", + "lang": "en", + "confidence": 16, + "object_marking_refs": [ + "marking-definition--3e914a0d-957f-40b2-8c35-b119040574fe" + ] + }, { "created": "2014-05-08T09:00:00.000Z", "modified": "2014-05-08T09:00:00.000Z", @@ -278,6 +297,12 @@ "id": "indicator--6770298f-0fd8-471a-ab8c-1c658a46574e", "media_type": "application/stix+json;version=2.1", "version": "2017-01-27T13:49:53.935Z" + }, + { + "date_added": "2022-06-16T13:49:53.935000Z", + "id": "malware-analysis--084a658c-a7ef-4581-a21d-1f600908741b", + "media_type": "application/stix+json;version=2.1", + "version": "2021-12-11T07:17:44.542582Z" } ] }, diff --git a/medallion/test/test_backends.py b/medallion/test/test_backends.py index 5353547..8ea4581 100644 --- a/medallion/test/test_backends.py +++ b/medallion/test/test_backends.py @@ -119,18 +119,19 @@ def test_get_objects(backend): assert r.content_type == MEDIA_TYPE_TAXII_V21 objs = r.json assert objs['more'] is False - assert len(objs['objects']) == 5 + assert len(objs['objects']) == 6 # testing date-added headers assert r.headers['X-TAXII-Date-Added-First'] == "2014-05-08T09:00:00.000000Z" - assert r.headers['X-TAXII-Date-Added-Last'] == "2017-12-31T13:49:53.935000Z" + assert r.headers['X-TAXII-Date-Added-Last'] == "2022-06-16T13:49:53.935000Z" # testing ordering of returned objects by date_added correct_order = ['relationship--2f9a9aa9-108a-4333-83e2-4fb25add0463', 'indicator--cd981c25-8042-4166-8945-51178443bdac', 'marking-definition--34098fce-860f-48ae-8e50-ebd3cc5e41da', 'malware--c0931cc6-c75e-47e5-9036-78fabc95d4ec', - 'indicator--6770298f-0fd8-471a-ab8c-1c658a46574e'] + 'indicator--6770298f-0fd8-471a-ab8c-1c658a46574e', + "malware-analysis--084a658c-a7ef-4581-a21d-1f600908741b"] for x in range(0, len(correct_order)): assert objs['objects'][x]['id'] == correct_order[x] @@ -257,11 +258,11 @@ def test_get_object_manifests(backend): assert r.status_code == 
200 assert r.content_type == MEDIA_TYPE_TAXII_V21 manifests = r.json - assert len(manifests["objects"]) == 5 + assert len(manifests["objects"]) == 6 # testing the date-added headers assert r.headers['X-TAXII-Date-Added-First'] == "2014-05-08T09:00:00.000000Z" - assert r.headers['X-TAXII-Date-Added-Last'] == "2017-12-31T13:49:53.935000Z" + assert r.headers['X-TAXII-Date-Added-Last'] == "2022-06-16T13:49:53.935000Z" # checking ordered by date_added @@ -296,7 +297,7 @@ def test_get_objects_added_after(backend): assert r.content_type == MEDIA_TYPE_TAXII_V21 objs = r.json assert objs['more'] is False - assert len(objs['objects']) == 3 + assert len(objs['objects']) == 4 def test_get_objects_limit(backend): @@ -329,10 +330,10 @@ def test_get_objects_limit(backend): assert r.content_type == MEDIA_TYPE_TAXII_V21 objs = r.json assert objs['more'] is False - assert len(objs['objects']) == 2 + assert len(objs['objects']) == 3 assert r.headers['X-TAXII-Date-Added-First'] == '2017-01-27T13:49:59.997000Z' - assert r.headers['X-TAXII-Date-Added-Last'] == '2017-12-31T13:49:53.935000Z' + assert r.headers['X-TAXII-Date-Added-Last'] == '2022-06-16T13:49:53.935000Z' correct_order = ['malware--c0931cc6-c75e-47e5-9036-78fabc95d4ec', 'indicator--6770298f-0fd8-471a-ab8c-1c658a46574e'] @@ -408,7 +409,7 @@ def test_objects_version_match_last(backend): def test_objects_version_match_all(backend): objs = get_objects_by_version(backend, "?match[version]=all") - assert len(objs['objects']) == 7 + assert len(objs['objects']) == 8 def get_objects_spec_version(backend, filter, num_objects): @@ -431,16 +432,16 @@ def test_get_objects_spec_version_20(backend): def test_get_objects_spec_version_21_20(backend): - get_objects_spec_version(backend, "?match[spec_version]=2.0,2.1", 5) + get_objects_spec_version(backend, "?match[spec_version]=2.0,2.1", 6) def test_get_objects_spec_version_21(backend): - objs = get_objects_spec_version(backend, "?match[spec_version]=2.1", 5) + objs = 
get_objects_spec_version(backend, "?match[spec_version]=2.1", 6) assert all(obj['spec_version'] == "2.1" for obj in objs['objects']) def test_get_objects_spec_version_default(backend): - objs = get_objects_spec_version(backend, "", 5) + objs = get_objects_spec_version(backend, "", 6) assert all(obj['spec_version'] == "2.1" for obj in objs['objects']) @@ -600,7 +601,7 @@ def test_get_manifest_added_after(backend): objs = r.json assert objs['more'] is False # only 2 because one is v2.0 - assert len(objs['objects']) == 2 + assert len(objs['objects']) == 3 def test_get_manifest_limit(backend): @@ -642,7 +643,7 @@ def test_get_manifest_limit(backend): assert r.content_type == MEDIA_TYPE_TAXII_V21 objs = r.json assert objs['more'] is False - assert len(objs['objects']) == 1 + assert len(objs['objects']) == 2 assert r.headers['X-TAXII-Date-Added-First'] == objs['objects'][0]['date_added'] assert r.headers['X-TAXII-Date-Added-Last'] == objs['objects'][-1]['date_added'] @@ -703,7 +704,7 @@ def test_get_manifest_version_specific(backend): def test_get_manifest_version_first(backend): object_id = "indicator--6770298f-0fd8-471a-ab8c-1c658a46574e" objs = get_manifest_version(backend, "?match[version]=first") - assert len(objs['objects']) == 5 + assert len(objs['objects']) == 6 for obj in objs['objects']: if obj['id'] == object_id: assert obj['version'] == "2016-11-03T12:30:59.000Z" @@ -712,7 +713,7 @@ def test_get_manifest_version_first(backend): def test_get_manifest_version_last(backend): object_id = "indicator--6770298f-0fd8-471a-ab8c-1c658a46574e" objs = get_manifest_version(backend, "?match[version]=last") - assert len(objs['objects']) == 5 + assert len(objs['objects']) == 6 for obj in objs['objects']: if obj['id'] == object_id: assert obj['version'] == "2017-01-27T13:49:53.935Z" @@ -720,7 +721,7 @@ def test_get_manifest_version_last(backend): def test_get_manifest_version_all(backend): objs = get_manifest_version(backend, "?match[version]=all") - assert 
len(objs['objects']) == 7 + assert len(objs['objects']) == 8 def get_manifest_spec_version(backend, filter): @@ -744,14 +745,14 @@ def test_manifest_spec_version_20(backend): def test_manifest_spec_version_21(backend): objs = get_manifest_spec_version(backend, "?match[spec_version]=2.1") - assert len(objs['objects']) == 5 + assert len(objs['objects']) == 6 assert all(obj['media_type'] == "application/stix+json;version=2.1" for obj in objs['objects']) def test_manifest_spec_version_2021(backend): objs = get_manifest_spec_version(backend, "?match[spec_version]=2.0,2.1") # though the spec_version filter is getting all objects, the automatic filtering by version only gets the latest objects - assert len(objs['objects']) == 5 + assert len(objs['objects']) == 6 for obj in objs['objects']: if obj['id'] == "malware--c0931cc6-c75e-47e5-9036-78fabc95d4ec": assert obj['version'] == "2018-02-23T18:30:00.000Z" @@ -760,7 +761,7 @@ def test_manifest_spec_version_2021(backend): def test_manifest_spec_version_default(backend): objs = get_manifest_spec_version(backend, "") # testing default value - assert len(objs['objects']) == 5 + assert len(objs['objects']) == 6 assert all(obj['media_type'] == "application/stix+json;version=2.1" for obj in objs['objects']) diff --git a/sample-config-with-memory-backend.json b/sample-config-with-memory-backend.json index 3515deb..4fdf940 100644 --- a/sample-config-with-memory-backend.json +++ b/sample-config-with-memory-backend.json @@ -1,7 +1,7 @@ { "backend": { "module_class": "MemoryBackend", - "filename": "medallion/test/data/default_data.json" + "filename": "../test/data/default_data.json" }, "users": { "admin": "Password0",
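Review note on the ``find_att`` change in medallion/common.py: the new lookup order can be tried in isolation with the sketch below. It is a minimal approximation, not the project's code. ``parse_timestamp`` is a hypothetical stand-in for medallion's ``string_to_datetime``, and ``find_version_timestamp`` is an illustrative name. The likely motivation, inferred from the new test data rather than stated in the patch, is that a STIX ``malware-analysis`` object has its own ``version`` property (here ``"moment"``) that is not a timestamp, so checking ``version`` first would presumably fail on it.

```python
from datetime import datetime, timezone


def parse_timestamp(value):
    """Hypothetical stand-in for medallion's string_to_datetime helper."""
    # Accept timestamps with or without fractional seconds.
    for fmt in ("%Y-%m-%dT%H:%M:%S.%fZ", "%Y-%m-%dT%H:%M:%SZ"):
        try:
            return datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue
    raise ValueError("Not a timestamp: {!r}".format(value))


def find_version_timestamp(obj):
    """Mirror the lookup order of the patched find_att."""
    # STIX object properties first, then _date_added, then a manifest "version".
    for key in ("modified", "created", "_date_added", "version"):
        if key in obj:
            return parse_timestamp(obj[key])
    raise ValueError("Unable to determine the version attribute of {}".format(obj))


analysis = {
    "type": "malware-analysis",
    "modified": "2021-12-11T07:17:44.542582Z",
    "version": "moment",  # version of the analysis product, not a timestamp
}
manifest_entry = {"version": "2021-12-11T07:17:44.542582Z"}

print(find_version_timestamp(analysis))
print(find_version_timestamp(manifest_entry))
```

With this order both the STIX object and its manifest entry resolve to the same timestamp, and an object with none of the four keys raises ``ValueError`` instead of a ``KeyError`` on "_date_added".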