EKO Rust reader v2 #400

Merged
merged 46 commits into from
Aug 22, 2024
Changes from 10 commits
7a10625
rust: Restart dekoder
felixhekhorn Aug 12, 2024
e2e5e9f
rust: Start reading ops in output
felixhekhorn Aug 15, 2024
5a8b7ef
rust: Start reading files
felixhekhorn Aug 16, 2024
b3c3682
rust: Start reading operators
felixhekhorn Aug 16, 2024
19dcd27
rust: Read ops as ref
felixhekhorn Aug 19, 2024
2b96396
rust: Introduce thiserror
felixhekhorn Aug 19, 2024
fd85059
rust: Remove 2 more unwraps
felixhekhorn Aug 19, 2024
a297b9e
rust: Improve op loading
felixhekhorn Aug 19, 2024
9b0e1df
rust: Remove all unwraps
felixhekhorn Aug 19, 2024
edd0915
rust: Trade Box for Sized
felixhekhorn Aug 19, 2024
335115f
rust: Drop pyproject.toml for dekoder
felixhekhorn Aug 19, 2024
44fc971
rust: Introduce assert_fs crate
felixhekhorn Aug 20, 2024
ea6fd0d
rust: Adjust EKO::close
felixhekhorn Aug 20, 2024
4d37d72
rust: Improve Operator handling
felixhekhorn Aug 20, 2024
bf9781a
rust: Drop temporary test dir
felixhekhorn Aug 20, 2024
36dbf7c
rust: Add EKO::close test
felixhekhorn Aug 20, 2024
a45801e
rust: Add more EKO::check
felixhekhorn Aug 20, 2024
814e256
rust: Fix many clippy warnings
felixhekhorn Aug 20, 2024
d9fe346
rust: Raise MSRV to 1.70.0
felixhekhorn Aug 20, 2024
bb9edd1
Add cargo clippy to pre-commit
felixhekhorn Aug 20, 2024
d888c33
rust: Use std instead of hashbrown
felixhekhorn Aug 20, 2024
f0d6a27
rust: Incomplete attempt to use TryFrom
felixhekhorn Aug 20, 2024
599fb85
rust: Specify error in TryFrom
felixhekhorn Aug 20, 2024
1a696a1
rust: Allow only Operator as value in Inventory
felixhekhorn Aug 20, 2024
8d0d759
rust: Remove file properties from EKO
felixhekhorn Aug 20, 2024
6c7c470
rust: Simplify Inventory::keys()
felixhekhorn Aug 21, 2024
5ca8848
rust: Chain key finding calls
felixhekhorn Aug 21, 2024
dff7581
rust: Chain operator loading calls
felixhekhorn Aug 21, 2024
787ca42
rust: Remove BufReader from operator loading
felixhekhorn Aug 21, 2024
1656cfc
rust: Drop Operator::new()
felixhekhorn Aug 21, 2024
feca19a
rust: Inline path_exists
felixhekhorn Aug 21, 2024
410dcfa
rust: Rename EKO::check to assert_working_dir
felixhekhorn Aug 21, 2024
8eb8775
rust: Simplify some returns
felixhekhorn Aug 21, 2024
229670a
rust: Make tar writer capacity a const
felixhekhorn Aug 21, 2024
c6784d9
rust: Drop glob for read_dir
felixhekhorn Aug 21, 2024
5be3f18
rust: Use Path::exists
felixhekhorn Aug 21, 2024
e283553
rust: Load operator via return and not reference
felixhekhorn Aug 21, 2024
c81bb08
rust: Drop float_cmp for manual check
felixhekhorn Aug 21, 2024
a93c14c
rust: Split EKO to separate module
felixhekhorn Aug 21, 2024
d2b7ea4
rust: Attempt to move test data to server
felixhekhorn Aug 21, 2024
77e6d15
rust: Fix doc abbrev
felixhekhorn Jul 18, 2024
5fe868c
Fix poe rdocs
felixhekhorn Aug 21, 2024
234a5e5
rust: Improve docs
felixhekhorn Aug 21, 2024
1177a57
rust: Load errors
felixhekhorn Aug 21, 2024
d394978
rust: Swap eq for ==
felixhekhorn Aug 22, 2024
bf132ea
rust: Make Inventory more autonomous
felixhekhorn Aug 22, 2024
543 changes: 543 additions & 0 deletions Cargo.lock

Large diffs are not rendered by default.

27 changes: 27 additions & 0 deletions crates/dekoder/Cargo.toml
@@ -0,0 +1,27 @@
[package]
name = "dekoder"

authors.workspace = true
description.workspace = true
readme.workspace = true
categories.workspace = true
edition.workspace = true
keywords.workspace = true
license.workspace = true
repository.workspace = true
rust-version.workspace = true
version.workspace = true

[package.metadata.docs.rs]
rustdoc-args = ["--html-in-header", "doc-header.html"]

[dependencies]
tar = "0.4.41"
hashbrown = "0.14"
glob = "0.3.1"
float-cmp = "0.9.0"
yaml-rust2 = "0.8"
lz4_flex = "0.9.2"
ndarray = "0.15.4"
ndarray-npy = "0.8.1"
thiserror = "1.0.63"
1 change: 1 addition & 0 deletions crates/dekoder/doc-header.html
16 changes: 16 additions & 0 deletions crates/dekoder/pyproject.toml
@@ -0,0 +1,16 @@
[build-system]
requires = ["maturin>=1.1,<2.0"]
build-backend = "maturin"

[project]
name = "dekoder"
requires-python = ">=3.9"
classifiers = [
    "Programming Language :: Rust",
    "Programming Language :: Python :: Implementation :: CPython",
    "Programming Language :: Python :: Implementation :: PyPy",
]
dependencies = ["cffi"]

[tool.maturin]
bindings = "cffi"
87 changes: 87 additions & 0 deletions crates/dekoder/src/inventory.rs
@@ -0,0 +1,87 @@
//! Assets manager.
use glob::glob;
use hashbrown::HashMap;
use std::ffi::OsString;
use std::fs::read_to_string;
use std::path::PathBuf;
use yaml_rust2::{Yaml, YamlLoader};

use crate::{EKOError, Result};

/// Headers are in yaml files.
const HEADER_EXT: &'static str = "*.yaml";

/// Header type in an inventory.
pub(crate) trait HeaderT {
    /// Load from yaml.
    fn load_from_yaml(yml: &Yaml) -> Result<Self>
    where
        Self: Sized;
    /// Comparator.
    fn eq(&self, other: &Self, ulps: i64) -> bool;
}

/// Value type in an inventory.
pub(crate) trait ValueT {
    /// File suffix (instead of header suffix).
    const FILE_SUFFIX: &'static str;
    /// Load from file.
    fn load_from_path(&mut self, p: PathBuf) -> Result<()>;
}

/// Assets manager.
pub(crate) struct Inventory<K: HeaderT> {
    /// Working directory
    pub(crate) path: PathBuf,
    /// Available keys
    pub(crate) keys: HashMap<OsString, K>,
}

impl<K: HeaderT> Inventory<K> {
    /// Load all available entries.
    pub fn load_keys(&mut self) -> Result<()> {
        let path = self.path.join(&HEADER_EXT);
        let path = path
            .to_str()
            .ok_or(EKOError::KeyError("due to invalid path".to_owned()))?;
        for entry in glob(path)
            .map_err(|_| EKOError::KeyError("because failed to read glob pattern".to_owned()))?
            .filter_map(core::result::Result::ok)
        {
            let cnt = YamlLoader::load_from_str(&read_to_string(&entry)?)
                .map_err(|_| EKOError::KeyError("because failed to read yaml file.".to_owned()))?;
            self.keys.insert(
                entry
                    .file_name()
                    .ok_or(EKOError::KeyError(
                        "because failed to read file name".to_owned(),
                    ))?
                    .to_os_string(),
                K::load_from_yaml(&cnt[0])?,
            );
        }
        Ok(())
    }

    /// List available keys.
    pub fn keys(&self) -> Vec<&K> {
        let mut ks = Vec::new();
        for k in self.keys.values() {
            ks.push(k);
        }
        ks
    }

    /// Check if `k` is available (with given precision).
    pub fn has(&self, k: &K, ulps: i64) -> bool {
        self.keys.iter().find(|it| (it.1).eq(&k, ulps)).is_some()
    }

    /// Load `k` from disk.
    pub fn load<V: ValueT>(&mut self, k: &K, ulps: i64, v: &mut V) -> Result<()> {
        let k = self.keys.iter().find(|it| (it.1).eq(&k, ulps));
        let k = k.ok_or(EKOError::KeyError("because it was not found".to_owned()))?;
        let path = self.path.join(k.0).with_extension(V::FILE_SUFFIX);
        v.load_from_path(path)
    }
}
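
Note on the lookup in `Inventory::has` and `Inventory::load`: the keys carry a floating-point scale, so they are not hashed directly. Instead every stored header is scanned and compared with a tolerance given in ULPs. The following is a minimal, std-only sketch of that idea. The names `Point`, `ordered_bits`, `approx_eq_ulps` and `find_key` are hypothetical and not part of this PR, and the ULP distance here is a simplified stand-in for `float_cmp::approx_eq!`, which may treat edge cases differently.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a header stored in the inventory.
struct Point {
    scale: f64,
    nf: i64,
}

/// Map the bit pattern of `x` onto an integer line that is ordered like the floats.
fn ordered_bits(x: f64) -> i64 {
    let bits = x.to_bits() as i64;
    // Negative floats have the sign bit set; mirror them below zero.
    if bits < 0 {
        i64::MIN - bits
    } else {
        bits
    }
}

/// Assumption: two floats are "equal" if at most `ulps` representable values lie between them.
fn approx_eq_ulps(a: f64, b: f64, ulps: u64) -> bool {
    if a.is_nan() || b.is_nan() {
        return false;
    }
    ordered_bits(a).abs_diff(ordered_bits(b)) <= ulps
}

/// Linear scan over all headers, returning the file name of the first match.
fn find_key<'a>(
    keys: &'a HashMap<String, Point>,
    scale: f64,
    nf: i64,
    ulps: u64,
) -> Option<&'a str> {
    keys.iter()
        .find(|(_, p)| p.nf == nf && approx_eq_ulps(p.scale, scale, ulps))
        .map(|(name, _)| name.as_str())
}

fn main() {
    let mut keys = HashMap::new();
    keys.insert("a.yaml".to_owned(), Point { scale: 10000.0, nf: 4 });
    // The next representable float above 10000.0 is one ULP away.
    let next_up = f64::from_bits(10000.0_f64.to_bits() + 1);
    assert_eq!(find_key(&keys, next_up, 4, 2), Some("a.yaml"));
    // A visibly different scale or a different nf does not match.
    assert_eq!(find_key(&keys, 10000.5, 4, 2), None);
    assert_eq!(find_key(&keys, 10000.0, 5, 2), None);
}
```

The scan is linear in the number of keys, which is presumably acceptable here because an EKO holds few evolution points.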
234 changes: 234 additions & 0 deletions crates/dekoder/src/lib.rs
@@ -0,0 +1,234 @@
//! eko output interface.
use float_cmp::approx_eq;
use hashbrown::HashMap;
use lz4_flex::frame::FrameDecoder;
use ndarray::Array4;
use ndarray_npy::NpzReader;
use std::fs::remove_dir_all;
use std::fs::File;
use std::io::{BufReader, BufWriter, Cursor};
use std::path::PathBuf;
use thiserror::Error;
use yaml_rust2::Yaml;

mod inventory;

use crate::inventory::HeaderT;

/// The EKO errors.
#[derive(Error, Debug)]
pub enum EKOError {
    #[error("No working directory")]
    NoWorkingDir,
    #[error("I/O error")]
    IOError(#[from] std::io::Error),
    #[error("No target path given")]
    NoTargetPath,
    #[error("Target path `{0}` already exists")]
    TargetAlreadyExists(PathBuf),
    #[error("Loading operator from `{0}` failed")]
    OperatorLoadError(PathBuf),
    #[error("Failed to read key(s) `{0}`")]
    KeyError(String),
}

/// My result type always carries my errors.
type Result<T> = std::result::Result<T, EKOError>;

/// A reference point in the evolution atlas.
pub struct EvolutionPoint {
    /// Evolution scale.
    pub scale: f64,
    /// Number of flavors
    pub nf: i64,
}

impl inventory::HeaderT for EvolutionPoint {
    /// Load from yaml.
    fn load_from_yaml(yml: &Yaml) -> Result<Self> {
        // work around float representation
        let scale = yml["scale"].as_f64();
        let scale = if scale.is_some() {
            scale.ok_or(EKOError::KeyError(
                "because failed to read scale as float".to_owned(),
            ))?
        } else {
            yml["scale"].as_i64().ok_or(EKOError::KeyError(
                "because failed to read scale as float from int".to_owned(),
            ))? as f64
        };
        let nf = yml["nf"]
            .as_i64()
            .ok_or(EKOError::KeyError("because failed to read nf".to_owned()))?;
        Ok(Self { scale, nf })
    }

    /// (Protected) comparator.
    fn eq(&self, other: &Self, ulps: i64) -> bool {
        self.nf == other.nf && approx_eq!(f64, self.scale, other.scale, ulps = ulps)
    }
}

impl EvolutionPoint {
    /// Comparator.
    pub fn equals(&self, other: &Self, ulps: i64) -> bool {
        self.eq(other, ulps)
    }
}

/// 4D evolution operator.
pub struct Operator {
    pub op: Array4<f64>,
}

impl inventory::ValueT for Operator {
    const FILE_SUFFIX: &'static str = "npz.lz4";
    fn load_from_path(&mut self, p: PathBuf) -> Result<()> {
        let mut reader =
            BufReader::new(FrameDecoder::new(BufReader::new(File::open(p.to_owned())?)));
        let mut buffer = Vec::new();
        std::io::copy(&mut reader, &mut buffer)?;
        let mut npz = NpzReader::new(Cursor::new(buffer))
            .map_err(|_| EKOError::OperatorLoadError(p.to_owned()))?;
        let operator: Array4<f64> = npz
            .by_name("operator.npy")
            .map_err(|_| EKOError::OperatorLoadError(p.to_owned()))?;
        self.op = operator;
        Ok(())
    }
}

impl Operator {
    /// Empty initializer.
    pub fn zeros() -> Self {
        Self {
            op: Array4::zeros((0, 0, 0, 0)),
        }
    }
}

/// EKO output
pub struct EKO {
    /// Working directory
    path: PathBuf,
    /// Associated archive path
    tar_path: Option<PathBuf>,
    /// allow content modifications?
    read_only: bool,
    /// final operators
    operators: inventory::Inventory<EvolutionPoint>,
}

/// Operators directory.
const DIR_OPERATORS: &'static str = "operators/";

impl EKO {
    /// Check our working directory is safe.
    fn check(&self) -> Result<()> {
        let path_exists = self.path.try_exists().is_ok_and(|x| x);
        if !path_exists {
            return Err(EKOError::NoWorkingDir);
        }
        Ok(())
    }

    /// Remove the working directory.
    fn destroy(&self) -> Result<()> {
        self.check()?;
        Ok(remove_dir_all(self.path.to_owned())?)
    }

    /// Write content back to an archive and destroy working directory.
    pub fn close(&self, allow_overwrite: bool) -> Result<()> {
        self.write(allow_overwrite, true)
    }

    /// Write content back to an archive.
    pub fn write(&self, allow_overwrite: bool, destroy: bool) -> Result<()> {
        self.check()?;
        // in read-only mode nothing can have changed, so all that is left to do is destroy
        if self.read_only && destroy {
            return self.destroy();
        }
        // check we can write
        if self.tar_path.is_none() {
            return Err(EKOError::NoTargetPath);
        }
        let dst = self.tar_path.to_owned().ok_or(EKOError::NoTargetPath)?;
        let dst_exists = dst.try_exists().is_ok_and(|x| x);
        if !allow_overwrite && dst_exists {
            return Err(EKOError::TargetAlreadyExists(dst));
        }
        // create writer
        let dst_file = File::create(dst.to_owned())?;
        let dst_file = BufWriter::with_capacity(128 * 1024, dst_file);
        let mut ar = tar::Builder::new(dst_file);
        // do it!
        ar.append_dir_all(".", self.path.to_owned())?;
        // cleanup
        if destroy {
            self.destroy()?;
        }
        Ok(())
    }

    /// Set the archive path.
    pub fn set_tar_path(&mut self, tar_path: PathBuf) {
        self.tar_path = Some(tar_path.to_owned());
    }

    /// Open tar from `src` to `dst` for reading.
    pub fn read(src: PathBuf, dst: PathBuf) -> Result<Self> {
        Self::extract(src, dst, true)
    }

    /// Open tar from `src` to `dst` for editing.
    pub fn edit(src: PathBuf, dst: PathBuf) -> Result<Self> {
        Self::extract(src, dst, false)
    }

    /// Extract tar file from `src` to `dst`.
    pub fn extract(src: PathBuf, dst: PathBuf, read_only: bool) -> Result<Self> {
        let mut ar = tar::Archive::new(File::open(src.to_owned())?);
        ar.unpack(dst.to_owned())?;
        let mut obj = Self::load_opened(dst, read_only)?;
        obj.set_tar_path(src);
        Ok(obj)
    }

    /// Load an EKO from a directory `path` (instead of tar).
    pub fn load_opened(path: PathBuf, read_only: bool) -> Result<Self> {
        let mut operators = inventory::Inventory {
            path: path.join(DIR_OPERATORS),
            keys: HashMap::new(),
        };
        operators.load_keys()?;
        Ok(Self {
            path,
            tar_path: None,
            read_only,
            operators,
        })
    }

    /// List available evolution points.
    pub fn available_operators(&self) -> Vec<&EvolutionPoint> {
        self.operators.keys()
    }

    /// Check if the operator at the evolution point `ep` is available.
    pub fn has_operator(&self, ep: &EvolutionPoint, ulps: i64) -> bool {
        self.operators.has(ep, ulps)
    }

    /// Load the operator at the evolution point `ep` from disk.
    pub fn load_operator(
        &mut self,
        ep: &EvolutionPoint,
        ulps: i64,
        op: &mut Operator,
    ) -> Result<()> {
        self.operators.load(ep, ulps, op)?;
        Ok(())
    }
}
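
Note on the error handling in `lib.rs`: the `?` operator on I/O calls works because `#[from] std::io::Error` provides a `From` conversion into `EKOError`, and `EKO::write` refuses to overwrite an existing archive unless asked to. The sketch below spells out roughly the same pattern by hand with std only. `SketchError`, `checked_target` and `read_len` are hypothetical names for illustration, and the hand-written impls only approximate what the `thiserror` derive produces.

```rust
use std::fmt;
use std::path::{Path, PathBuf};

/// Hypothetical, reduced version of the error enum.
#[derive(Debug)]
enum SketchError {
    Io(std::io::Error),
    NoTargetPath,
    TargetAlreadyExists(PathBuf),
}

// Hand-written counterpart of the `#[error("...")]` attributes.
impl fmt::Display for SketchError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::Io(_) => write!(f, "I/O error"),
            Self::NoTargetPath => write!(f, "No target path given"),
            Self::TargetAlreadyExists(p) => {
                write!(f, "Target path `{}` already exists", p.display())
            }
        }
    }
}

impl std::error::Error for SketchError {
    fn source(&self) -> Option<&(dyn std::error::Error + 'static)> {
        match self {
            Self::Io(e) => Some(e),
            _ => None,
        }
    }
}

// Hand-written counterpart of `#[from]`: this is what lets `?` convert I/O errors.
impl From<std::io::Error> for SketchError {
    fn from(e: std::io::Error) -> Self {
        Self::Io(e)
    }
}

/// Resolve the archive target with the same two guards as in `write`.
fn checked_target(tar_path: Option<&Path>, allow_overwrite: bool) -> Result<PathBuf, SketchError> {
    let dst = tar_path.ok_or(SketchError::NoTargetPath)?.to_path_buf();
    if !allow_overwrite && dst.exists() {
        return Err(SketchError::TargetAlreadyExists(dst));
    }
    Ok(dst)
}

/// Any I/O failure is turned into `SketchError::Io` by `?`.
fn read_len(p: &Path) -> Result<u64, SketchError> {
    Ok(std::fs::metadata(p)?.len())
}

fn main() {
    let existing = std::env::temp_dir();
    match checked_target(Some(&existing), false) {
        Err(e) => println!("refused: {e}"),
        Ok(p) => println!("would write to {}", p.display()),
    }
    if let Err(e) = read_len(&existing.join("dekoder-sketch-missing-file")) {
        println!("failed: {e}");
    }
}
```

Keeping the source error inside the `Io` variant means callers can still inspect the underlying cause through `Error::source`.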
Binary file added crates/dekoder/tests/data/v0.15.tar
Binary file not shown.
1 change: 1 addition & 0 deletions crates/dekoder/tests/target/.gitignore
@@ -0,0 +1 @@
*