docs: arch: 2nd and 3rd party plugins #1061
Pinging @yashlamba @sakshamarora1 for re-review
We need to account for how we deal with different Python versions. For example, it looks like TensorFlow doesn't support 3.9 yet.
Co-authored-by: Yash Lamba <[email protected]> Co-authored-by: Saksham Arora <[email protected]> Signed-off-by: John Andersen <[email protected]>
Three manifests
After that there can be acceptance criteria on the output of each operation, or on the operations as a set (don't mind if these parts fail).
A webhook service which checks the dependency tree of a project for an incoming webhook and dispatches downstream validation for other projects. Related: #1315
- 0 main package
- 1 2nd party required pass for release
This corresponds to [models] in https://intel.github.io/dffml/installation.html#installing-all-plugins
Essentially, we trigger a domino effect. We analyze the requirements files of all of the plugins that are either first or second party (possibly supporting third party later somehow) and build a dependency tree to understand which packages or plugins depend on the plugin being changed in the original pull request. We run the validation for the original pull request, and then we trigger the CI runs of all of the downstream projects with the original PR applied. If any of the downstream repos need to be changed for their CI to pass, we can create PRs against those repos. In the original PR we can provide overrides for each downstream package, so that when we trigger the validation for each downstream package we can say: use this PR. So if you've made an API breaking change, you go through all of the downstream packages, make the changes, and submit the PRs that would make it OK. Then you specify all of those PRs, which will be used when running the CI of the respective downstream packages.
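The domino effect described above can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the package names, the PR identifiers, and the shape of the `requirements` mapping are all hypothetical, and a real version would parse actual requirements files.

```python
from collections import defaultdict, deque


def reverse_dependencies(requirements):
    """Map each package to the set of packages that directly depend on it."""
    dependents = defaultdict(set)
    for package, deps in requirements.items():
        for dep in deps:
            dependents[dep].add(package)
    return dependents


def downstream_of(changed, requirements):
    """List every package that depends on `changed`, directly or
    transitively, in breadth-first order."""
    dependents = reverse_dependencies(requirements)
    visited = {changed}
    ordered = []
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in sorted(dependents[current]):
            if dependent not in visited:
                visited.add(dependent)
                ordered.append(dependent)
                queue.append(dependent)
    return ordered


def validation_plan(changed, original_pr, requirements, overrides=None):
    """One CI run per affected package. Every run applies the original PR;
    a downstream package with an override also applies its own fix-up PR."""
    overrides = overrides or {}
    plan = [{"package": changed, "apply": [original_pr]}]
    for package in downstream_of(changed, requirements):
        apply = [original_pr]
        if package in overrides:
            apply.append(overrides[package])
        plan.append({"package": package, "apply": apply})
    return plan


# Hypothetical data: package name -> direct dependencies.
REQUIREMENTS = {
    "dffml": [],
    "dffml-model-a": ["dffml"],
    "dffml-operations-b": ["dffml", "dffml-model-a"],
    "unrelated-plugin": [],
}

PLAN = validation_plan(
    "dffml",
    "dffml#1",
    REQUIREMENTS,
    overrides={"dffml-operations-b": "dffml-operations-b#7"},
)
```

Here `PLAN` holds one entry for the changed package and one per downstream package; `unrelated-plugin` is skipped because nothing links it to the change.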
We should also make sure to support 3rd party plugins' ability to revalidate against any of their dependencies whenever one of those dependencies changes. Possibly some kind of service people can set as a webhook, which acts as a sort of pubsub. The SCM server, such as GitHub, publishes webhook events to the service (`dffml-service-sw-src-change-notify`). The service then relays to any listeners. Listeners are downstream projects. Downstream projects can register themselves with the service to receive change events for any of their dependencies. Registration involves plugin based configurable callbacks.
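A minimal in-process sketch of the relay idea, assuming a much simpler shape than a real service would have. The class name, the event keys (`repository`, `commit`), and the callback signature are hypothetical and are not taken from `dffml-service-sw-src-change-notify`.

```python
from collections import defaultdict


class ChangeNotifyRelay:
    """Relays change events for a repository to the callbacks that
    downstream projects registered for that dependency."""

    def __init__(self):
        # dependency repository -> list of registered callbacks
        self._listeners = defaultdict(list)

    def register(self, dependency, callback):
        """A downstream project subscribes to changes in `dependency`."""
        self._listeners[dependency].append(callback)

    def publish(self, event):
        """Called when the SCM server delivers a webhook event. Returns
        whatever each listener's callback returned, in registration order."""
        callbacks = self._listeners.get(event["repository"], [])
        return [callback(event) for callback in callbacks]


def revalidate_plugin(event):
    # Hypothetical callback: a real one might dispatch a CI run.
    return f"revalidate my-3rd-party-plugin against {event['repository']}@{event['commit']}"


relay = ChangeNotifyRelay()
relay.register("intel/dffml", revalidate_plugin)
dispatched = relay.publish({"repository": "intel/dffml", "commit": "abc123"})
```

Keeping registration keyed by dependency means the relay never needs to know a listener's own dependency tree; each downstream project decides what it subscribes to.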
- Could create DIDs for each event (change)
- Update 2023-03-21: Doing ActivityPub for now, ideally we content address those statuses or even better hardware rooted DIDs (`did:keri`) as status IDs
Contributing To a 2nd Party Plugin
$ git clone https://github.com/owner/repo
$ cd repo
$ python -m venv .venv
$ echo On Windows run .\.venv\Scripts\activate instead of next command
$ . .venv/bin/activate # Windows: .\.venv\Scripts\activate
$ python -m pip install -U pip setuptools wheel distlib
$ python -m pip install -U \
    -e .[dev] \
    "https://github.com/intel/dffml/archive/main.zip#egg=dffml"
If you are locally developing on DFFML as well, then also run the following.
$ mkdir -p ~/Documents/python
$ git clone -b some-branch https://github.com/intel/dffml ~/Documents/python/dffml
$ python -m pip uninstall -y dffml
$ SETUPTOOLS_USE_DISTUTILS=stdlib python -m pip install -U \
    -e ~/Documents/python/dffml
$ cd ~/Documents/python/dffml
$ python -m venv .venv
$ echo On Windows run .\.venv\Scripts\activate instead of next command
$ . .venv/bin/activate # Windows: .\.venv\Scripts\activate
$ python -m pip install -U pip setuptools wheel distlib
$ SETUPTOOLS_USE_DISTUTILS=stdlib python -m pip install -U \
    -e .[dev]
You now have a separate virtual environment for working on DFFML and for this
References:
Didn't mean to close this. Related: Alice, who will help us help our ecosystem: #1399 & #1401 & #1207
Rolling Alice: Progress Report 1: Where are we
Okay, so we are going to do our series of tutorials on building Alice, our
PR validation
We don't necessarily need to update status checks via the API; we can just have a pipeline within PR workflows which says this other PR must be merged in an upstream or downstream before this one can auto merge.
- https://github.blog/changelog/2023-02-08-pull-request-merge-queue-public-beta/ (removes need for custom locking bot on PRs mentioned)
- The wait for message on ActivityPub will enable our poly repo merge queue, leveraging GitHub native merge queues per repo.
schema/github/actions/result/container/example-pull-request-validation.yaml
$schema: "https://github.com/intel/dffml/raw/dffml/schema/github/actions/result/container/0.0.0.schema.json"
commit_url: "https://github.com/intel/dffml/commit/1f347bc7f63f65041a571d9e3c174d8b9ead24aa"
job_url: "https://github.com/intel/dffml/actions/runs/4185582030/jobs/7252852590"
result: "docker.io/intelotc/dffml@sha256:ae636f72f96f499ff5206150ebcaafbd64ce30affa7560ce0a41f54e871da2"
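A consumer of a manifest like the one above might want to confirm that `result` is pinned by digest before trusting it. The following is a rough sketch under assumptions: the list of required keys is inferred from the example rather than from the schema, the regular expression is a loose check and not a full OCI reference parser, and the example values are placeholders.

```python
import re

# Loose pattern for "name@algorithm:hexdigest" (an assumption, not the
# grammar used by any particular registry or by the schema).
DIGEST_REFERENCE = re.compile(
    r"^(?P<name>[^@\s]+)@(?P<algorithm>[a-z0-9]+):(?P<digest>[0-9a-f]+)$"
)

# Keys seen in the example manifest; whether the schema requires all of
# them is an assumption.
EXPECTED_KEYS = ("$schema", "commit_url", "job_url", "result")


def check_manifest(manifest):
    """Return the parsed parts of `result`, or raise ValueError if a key
    is missing or `result` is not a digest-pinned image reference."""
    missing = [key for key in EXPECTED_KEYS if key not in manifest]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    match = DIGEST_REFERENCE.match(manifest["result"])
    if match is None:
        raise ValueError("result is not pinned by digest")
    return match.groupdict()


# Placeholder manifest with hypothetical URLs and an all-zero digest.
EXAMPLE = {
    "$schema": "https://example.com/0.0.0.schema.json",
    "commit_url": "https://example.com/owner/repo/commit/0000000",
    "job_url": "https://example.com/owner/repo/actions/runs/1/jobs/1",
    "result": "docker.io/example/image@sha256:" + "0" * 64,
}

PARTS = check_manifest(EXAMPLE)
```

A tag-only reference such as `docker.io/example/image:latest` is rejected by this check, since a tag can move while a digest cannot.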
2023-02-23 @pdxjohnny Engineering Logs
- https://github.com/cloudfoundry-community/node-cfenv
- Eventing helps us have Alice sit alongside and look at new issues, workflow runs, etc. This will help her help developers stay away from known bad/unhelpful trains of thought.
- She can look at issue bodies for similar stack traces
  - Eventually we'll have the updating like we do, where we update the issue or discussion thread with what console commands and outputs we run while debugging, or we'll just do peer to peer depending on context!
  - docs: arch: Inventory #1207
  - live at HEAD is great, but poly repo PR validation will bring us into the future, since we'll be running inference over all the active pull requests
    - We'll take this further to branches, then to the in progress trains of thought (active debug, states of the art which gatekeeper/umbrella/prioritizer says are active based on overlays for context of scientific exploration)
      - As our inference gets better, we'll look across the trains of thought and `Prophet.predict()` state of the art trains of thought, then validate those via dispatch/distributed compute, then we'll start to just infer the outputs of the distributed compute, and validate based on risk and criticality, we'll then have our best guess muscle memory machine.
- Mermaid has mind map functionality now
- https://www.youtube.com/watch?v=tXJ03mPChYo&t=375s
- Alice helps us understand the security posture of this whole stack over its lifecycle. She's trying to help us understand the metrics and models produced from analysis of our software and improve it in arbitrary areas (via overlays). She has overlays for dependency analysis and deciding if there is anything she can do to help improve those dependencies. `alice threats` will be where she decides if those changes or the stats mined from shouldi are aligned to her strategic principles. We'll also look to generate threat models based on analysis of dependencies found going down the rabbit hole again with alice shouldi (shouldi: deptree: Create dependency tree of project #596). These threat models can then be improved via running https://github.com/johnlwhiteman/living-threat-models auditor.py `alice threats audit`. Threats are inherently strategic, based on deployment context; they require knowledge of the code (static), past behavior (pulled from event stream of distributed compute runs), and understanding of what deployments are relevant for vuln analysis per the threat model.
  - Entity, infrastructure (methodology for traversal and chaining), (open) architecture
  - What are you running (+deps), where are you running it (overlayed deployment, this is evaluated in federated downstream SCITT for applicability and reissuance of VEX/VDR by downstream), and what's the upstream threat model telling you about whether you should care if what you're running and how you're running it yields unmitigated threats. If so, and Alice knows how to contribute, Alice please contribute. If not, and Alice doesn't know how to contribute, Alice please log todos, across org relevant poly repos.
  - When we do our depth of field mapping (ref early engineering log streams) we'll merge all the event stream analysis via the tuned brute force prioritizer (grep alice discussion arch)
- Loosely coupled DID VC CI/CD enables AI in the loop development in a decentralized poly repo environment (Open Source Software cross orgs)
WIP: IETF SCITT: Use Case: OpenSSF Metrics: activitypub extensions for security.txt
- Fork and exec over ActivityPub over DWN CLI
- https://github.com/soda480/wait-for-message-action
  - TODO Fork and add ActivityPub support
  - Usage of this in a `needs` -> `matrix/workflow_run/workflow_dispatch/ipvm` is below
  - Closing the loop. Fork and exec via Distributed Compute, oras.land, VCS events ActivityPub over DWN (CLI) with DID VC SCITT with DID:KERI backed attested compute rooted identities. Enabling decentralized hardware roots of trust to facilitate decentralized asynchronous supply chains. Open source software development with end to end encrypted grafted chains communication only via attested channels for comms on vuln discovery with responsible disclosure and communication of remediation.
  - This is also how we communicate
  - Reference entity actions for this use case
  - https://intel.github.io/dffml/main/examples/webhook/webhook.html#webhook-dataflow
    - next tutorial: kcp -> k8s -> cf push -> webhook service -> dataflow to create activitypub event -> dwn-cli send -> webrtc -> dwn-cli recv -> `alice threats listen activitypub -stdin` -> `alice shouldi contribute` -> `alice please contribute` -> soft-serve/github repo pull request -> webhook service
    - https://www.youtube.com/watch?v=TMlC_iAK3Rg&list=PLtzAOVTpO2jYt71umwc-ze6OmwwCIMnLw&t=2064s
    - https://www.youtube.com/watch?v=THKMfJpPt8I&list=PLtzAOVTpO2jYt71umwc-ze6OmwwCIMnLw&t=128s
    - https://github.com/charmbracelet/soft-serve
- scripts: images containers manifest: Build json from dirs #1450
- CI/CD Event Federation https://codeberg.org/forgejo/discussions/issues/12
- We'll be patching renovate and dependabot and integrating `FROM` rebuild chains off the pinning updates as containers are built with the poly repo pull requests, we'll have auto pull requests to pull requests to update the manifests as we promote across
graph TD
    subgraph transparency_service[Transparency Service]
        transparency_service_pypi_known_good_package[Trust Attestation in-toto style<br>test result for known-good-package]
    end
    subgraph shouldi[shouldi - OSS Risk Analysis]
        subgraph shouldi_pypi[PyPi]
            shouldi_pypi_insecure_package[insecure-package]
            shouldi_pypi_known_good_package[known-good-package]
        end
    end
    subgraph cache_index[Container with pip download for use with file:// pip index]
        subgraph cache_index_pypi[PyPi]
            cache_index_pyOpenSSL[pyOpenSSL]
        end
    end
    subgraph fork[Forked Open Source Packages]
        subgraph fork_c[C]
            fork_OpenSSL[fork - OpenSSL]
        end
        subgraph fork_python[Python]
            fork_pyOpenSSL[fork - pyOpenSSL]
        end
        fork_OpenSSL -->|Compile, link, embed| fork_pyOpenSSL
    end
    subgraph cicd[CI/CD]
        runner_tool_cache[$RUNNER_TOOL_CACHE]
        runner_image[Runner container image - OSDecentrAlice]
        subgraph loopback_index_service[Loopback/sidecar package index]
            serve_package[Serve Package]
        end
        subgraph workflow[Python project workflow]
            install_dependencies[Install Dependencies]
            install_dependencies -->|Deps from N-1 2nd<br>party SBOMs get cached| runner_tool_cache
            install_dependencies -->|PIP_INDEX_URL| loopback_index_service
        end
        runner_tool_cache --> runner_image
    end
    shouldi_pypi_known_good_package --> transparency_service_pypi_known_good_package
    serve_package -->|Check for presence of trust attestation<br>inserted against relevant statement<br>URN of policy engine workflow used| transparency_service_pypi_known_good_package
    cache_index_pypi -->|Populate $RUNNER_TOOL_CACHE<br>from cached index| runner_image
    fork_pyOpenSSL -->|Publish| cache_index_pyOpenSSL
…r eventing across pull requests in poly repo env Related: #1061
Repo locks
For what amounts to essentially a poly repo structure to work with the way that we're validating all of our pull requests against each other before merge, we need to ensure that when the original PR is merged, all the rest of the PRs associated with it (which might, for example, fix API breaking changes in downstream dependent packages) are also merged. Therefore we will need some sort of system account or bot which must approve every pull request. We can make the bot's logic so that if an approved reviewer has approved the pull request, then the bot will approve the pull request, initiate the locking procedure, and rebase the pull request into the repo. So when we have a change which affects more than one repo, we will trigger rebases into the respective repos' main branches while all of those repos are locked. In fact, all of the repos will be locked within the main repo and the 2nd party org. This is because we need to ensure that all of the changes get merged and that there are no conflicts which would result in us ending up in an unknown state. Our state is known so long as we have tested all of the PRs involved against the main branch, or rather the latest commit before rebase.

When all PRs in a set across repos are approved, the bot will merge starting with the farthest downstream PR. It will somehow specify version information to the CI so that the CI can block, waiting for the commit which was in the original PR to be merged before continuing. This will ensure that the CI jobs do not run against a slightly outdated version of the repo which the original PR was made against.
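The merge ordering described here (all PRs approved, then merge from the farthest downstream package back toward the original) could be computed along these lines. This is only a sketch under assumptions: package names and PR identifiers are hypothetical, locking and the CI wait are left out, and it relies on `graphlib`, which is in the standard library from Python 3.9.

```python
from graphlib import TopologicalSorter


def merge_order(requirements, pull_requests, approved):
    """Return (package, pr) pairs in merge order, farthest downstream first.

    requirements:  package -> list of direct dependencies
    pull_requests: package -> PR that belongs to this cross-repo set
    approved:      set of PRs which an approved reviewer has approved
    """
    unapproved = sorted(pr for pr in pull_requests.values() if pr not in approved)
    if unapproved:
        raise RuntimeError(f"not all PRs in the set are approved: {unapproved}")

    sorter = TopologicalSorter()
    for package in pull_requests:
        sorter.add(package)
    for package, deps in requirements.items():
        for dep in deps:
            if package in pull_requests and dep in pull_requests:
                # The dependent package is a predecessor of its dependency,
                # so downstream packages are emitted before upstream ones.
                sorter.add(dep, package)
    return [(package, pull_requests[package]) for package in sorter.static_order()]


# Hypothetical set of PRs spanning three repos.
REQUIREMENTS = {
    "dffml": [],
    "dffml-model-a": ["dffml"],
    "dffml-operations-b": ["dffml", "dffml-model-a"],
}
PULL_REQUESTS = {
    "dffml": "dffml#1",
    "dffml-model-a": "dffml-model-a#2",
    "dffml-operations-b": "dffml-operations-b#3",
}

ORDER = merge_order(REQUIREMENTS, PULL_REQUESTS, approved=set(PULL_REQUESTS.values()))
```

With this data the original `dffml` PR comes last, which matches the idea that downstream CI blocks until the original commit has landed.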
This should say CI, not CIA. I think this came from speech to text; my wrist was sprained.
Here is the start on the CI changes for the main package: https://gist.github.com/pdxjohnny/ae1430367040298b3a895abc4d2863ee