Merge pull request #394 from crocs-muni/document-reference-context-inference

document reference context inference
adamjanovsky authored Feb 20, 2024
2 parents 7ee9d2f + bb23ee6 commit 9f02cf2
Showing 7 changed files with 23 additions and 11 deletions.
2 changes: 1 addition & 1 deletion CONTRIBUTING.md
@@ -49,7 +49,7 @@ All commits shall pass the lint pipeline of the following tools:
 - Mypy (see [pyproject.toml](https://github.com/crocs-muni/sec-certs/blob/main/pyproject.toml) for settings)
 - Ruff (see [pyproject.toml](https://github.com/crocs-muni/sec-certs/blob/main/pyproject.toml) for settings)

-These tools can be installed via [dev_requirements.txt](https://github.com/crocs-muni/sec-certs/blob/main/dev_requirements.txt) You can use [pre-commit](https://pre-commit.com/) tool to register git hook that will evalute these checks prior to any commit and abort the commit for you. Note that the pre-commit is not meant to automatically fix the issues, just warn you.
+These tools can be installed via [dev_requirements.txt](https://github.com/crocs-muni/sec-certs/blob/main/requirements/dev_requirements.txt) You can use [pre-commit](https://pre-commit.com/) tool to register git hook that will evalute these checks prior to any commit and abort the commit for you. Note that the pre-commit is not meant to automatically fix the issues, just warn you.

 It should thus suffice to:

3 changes: 3 additions & 0 deletions docs/Makefile
@@ -14,6 +14,9 @@ help:

 .PHONY: help Makefile

+linkcheck:
+	@$(SPHINXBUILD) -b linkcheck . $(BUILDDIR)/linkcheck
+
 # Catch-all target: route all unknown targets to Sphinx using the new
 # "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
 %: Makefile
11 changes: 11 additions & 0 deletions docs/user_guide.md
@@ -21,3 +21,14 @@ nvd_api_key: null # or the actual key value
```
If you aim to fetch the sources from NVD, we advise you to get an [NVD API key](https://nvd.nist.gov/developers/request-an-api-key) and set the `nvd_api_key` setting accordingly. The download from NVD will work even without API key, it will just be slow. No API key is needed when `preferred_source_nvd_datasets: "sec-certs"`


## Infering inter-certificate reference context

```{important}
This is an experimental feature.
```

We provide a model that can predict the context of inter-certificate references based on the text embedded in the artifacts. The model output is not incorporated into the `CCCertificate` instances, but can be dumped into a `.csv` file from where it can be correlated with a DataFrame of certificate features.

To train and deploy the model, it should be sufficient to change some paths and run the [prediction notebook](https://github.com/crocs-muni/sec-certs/blob/main/notebooks/cc/reference_annotations/prediction.ipynb). The output of this notebook is a `prediction.csv` file that can be loaded into the [references notebook](https://github.com/crocs-muni/sec-certs/blob/main/notebooks/cc/references.ipynb). This notebook documents the full analysis of references conducted on the Common Criteria certificates. Among others, the notebook generates some further `.csv` files that can subsequently be plotted via [plotting notebook](https://github.com/crocs-muni/sec-certs/blob/main/notebooks/cc/paper2_plots.ipynb).
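
As a rough illustration of the correlation step described in the added documentation, the sketch below joins prediction rows with certificate features using only the standard library. The column names (`dgst`, `referenced_cert_id`, `label`, `category`, `year`), the inline CSV data, and the helper `join_predictions` are hypothetical placeholders, not the actual schema of `prediction.csv` or part of sec-certs.

```python
import csv
import io

# Hypothetical stand-in for prediction.csv; the real column names may differ.
PREDICTIONS_CSV = """dgst,referenced_cert_id,label
aaa111,CERT-A,label_a
bbb222,CERT-B,label_b
ccc333,CERT-C,label_a
"""

# Hypothetical certificate features keyed by the same digest column.
FEATURES = {
    "aaa111": {"category": "category_x", "year": 2021},
    "bbb222": {"category": "category_y", "year": 2019},
}


def join_predictions(csv_text, features):
    """Attach certificate features to each prediction row (inner join on 'dgst')."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        cert_features = features.get(row["dgst"])
        if cert_features is None:
            continue  # no features known for this certificate, drop the row
        rows.append({**row, **cert_features})
    return rows


joined = join_predictions(PREDICTIONS_CSV, FEATURES)
```

With pandas, the same inner join would typically be expressed with `DataFrame.merge` on the shared key column.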
6 changes: 3 additions & 3 deletions pyproject.toml
@@ -102,7 +102,7 @@
 sec-certs = "sec_certs.cli:main"

 [tool.ruff]
-select = [
+lint.select = [
     "I", # isort
     "E", # pycodestyle
     "W", # pycodestyle
@@ -113,14 +113,14 @@
     "C4", # comprehensions
     "SIM",
 ]
-ignore = [
+lint.ignore = [
     "E501", # line-length, should be handled by ruff format
 ]
 src = ["src", "tests"]
 line-length = 120
 target-version = "py310"

-[tool.ruff.mccabe]
+[tool.ruff.lint.mccabe]
 max-complexity = 10

 [tool.setuptools.package-data]
6 changes: 3 additions & 3 deletions src/sec_certs/sample/cpe.py
@@ -32,16 +32,16 @@ def __lt__(self, other: CPEMatchCriteria) -> bool:

     @classmethod
     def from_nist_dict(cls, dct: dict[str, Any]) -> CPEMatchCriteria:
-        if dct.get("versionStartIncluding", None):
+        if dct.get("versionStartIncluding"):
             version_start = ("including", dct["versionStartIncluding"])
         elif dct.get("versionStartExcluding"):
             version_start = ("excluding", dct["versionStartExcluding"])
         else:
             version_start = None

-        if dct.get("versionEndIncluding", None):
+        if dct.get("versionEndIncluding"):
             version_end = ("including", dct["versionEndIncluding"])
-        elif dct.get("versionEndExcluding", None):
+        elif dct.get("versionEndExcluding"):
             version_end = ("excluding", dct["versionEndExcluding"])
         else:
             version_end = None
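
The change above relies on `dict.get` returning `None` when no default is given, so dropping the explicit `None` does not alter behavior. A minimal sketch of the same branching, using a hypothetical helper `parse_bound` that is not part of sec-certs:

```python
def parse_bound(dct, prefix):
    """Return ("including" | "excluding", version) or None.

    `prefix` is "versionStart" or "versionEnd"; this helper is illustrative only.
    """
    # dict.get(key) already defaults to None, so no explicit default is needed.
    if dct.get(f"{prefix}Including"):
        return ("including", dct[f"{prefix}Including"])
    if dct.get(f"{prefix}Excluding"):
        return ("excluding", dct[f"{prefix}Excluding"])
    return None


sample = {"versionStartIncluding": "1.0", "versionEndExcluding": "2.0"}
start = parse_bound(sample, "versionStart")
end = parse_bound(sample, "versionEnd")
```

Note that the truthiness check also treats an empty string as a missing bound, which matches the condition used in the diff.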
2 changes: 1 addition & 1 deletion src/sec_certs/sample/fips.py
@@ -53,7 +53,7 @@ def get_web_data_and_algorithms(self) -> tuple[set[str], FIPSCertificate.WebData
     def _build_details_dict(self, details_div: Tag) -> dict[str, Any]:
         def parse_single_detail_entry(key, entry):
             normalized_key = DETAILS_KEY_NORMALIZATION_DICT[key]
-            normalization_func = DETAILS_KEY_TO_NORMALIZATION_FUNCTION.get(normalized_key, None)
+            normalization_func = DETAILS_KEY_TO_NORMALIZATION_FUNCTION.get(normalized_key)
             normalized_entry = (
                 FIPSHTMLParser.normalize_string(entry.text) if not normalization_func else normalization_func(entry)
             )
4 changes: 1 addition & 3 deletions src/sec_certs/utils/extract.py
@@ -727,16 +727,14 @@ def load_text_file(

     whole_text = ""
     whole_text_with_newlines = ""
-    lines_included = 0
-    for line in lines:
+    for lines_included, line in enumerate(lines):
         if limit_max_lines != -1 and lines_included >= limit_max_lines:
             break

         whole_text_with_newlines += line
         line = line.replace("\n", "")
         whole_text += line
         whole_text += line_separator
-        lines_included += 1

     return whole_text, whole_text_with_newlines, was_unicode_decode_error
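
The refactored loop can be exercised in isolation. The sketch below is a simplified stand-in for the loop body only: the function name `join_lines` and the default `line_separator=" "` are assumptions, and the file reading and Unicode error handling of the real `load_text_file` are omitted.

```python
def join_lines(lines, limit_max_lines=-1, line_separator=" "):
    """Concatenate lines two ways, stopping after limit_max_lines (-1 means no limit)."""
    whole_text = ""
    whole_text_with_newlines = ""
    # enumerate() supplies the running index, replacing the manual counter.
    for lines_included, line in enumerate(lines):
        if limit_max_lines != -1 and lines_included >= limit_max_lines:
            break

        whole_text_with_newlines += line
        whole_text += line.replace("\n", "") + line_separator

    return whole_text, whole_text_with_newlines


text, raw = join_lines(["a\n", "b\n", "c\n"], limit_max_lines=2)
```

Because the counter is only read inside the loop, replacing the manual increment with `enumerate` leaves the result unchanged.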
