Commit
Merge pull request #14 from wack/robbie/const-generic
Introduce Const Generic for Categorical Data
RobbieMcKinstry authored Nov 2, 2024
2 parents e32e4d9 + 843b32f commit c1b3cf1
Showing 4 changed files with 63 additions and 3 deletions.
14 changes: 13 additions & 1 deletion src/metrics/mod.rs
@@ -1,4 +1,4 @@
-use crate::stats::EnumerableCategory;
+use crate::stats::{Categorical, EnumerableCategory};
use std::fmt;

/// [ResponseStatusCode] groups HTTP response status codes according
@@ -18,6 +18,18 @@ pub enum ResponseStatusCode {
    _5XX,
}

impl Categorical<5> for ResponseStatusCode {
    fn category(&self) -> usize {
        match self {
            Self::_1XX => 0,
            Self::_2XX => 1,
            Self::_3XX => 2,
            Self::_4XX => 3,
            Self::_5XX => 4,
        }
    }
}

impl fmt::Display for ResponseStatusCode {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
43 changes: 43 additions & 0 deletions src/stats/categorical.rs
@@ -0,0 +1,43 @@
/// Data is [Categorical] if its elements map surjectively onto the integers
/// in `[0, N)`. The [Categorical] trait expresses data in which each element fits into exactly one
/// of `N` categories (or bins). The value of `N` is the total (i.e. the maximum)
/// number of categories.
/// For example, when modeling bools, the groups are `true` and `false`, so `N = 2`.
/// When modeling a six-sided die, the groups are 0 through 5, so `N = 6`.
/// Each instance must be able to report which category it belongs to (using the `Self::category` method).
/// Categories are zero-indexed (the first category is represented by `0usize`).
/// You can think of a [Categorical] type as a hashmap with fixed integer keys. When the map is
/// created, its keys must already be known and completely cover the range `[0, N)`.
///
/// ```rust
/// use std::collections::HashSet;
/// use canary::stats::Categorical;
///
/// #[derive(PartialEq, Eq, Debug, Hash)]
/// enum Coin {
///     Heads,
///     Tails,
/// }
///
/// impl Categorical<2> for Coin {
///     fn category(&self) -> usize {
///         match self {
///             Self::Heads => 0,
///             Self::Tails => 1,
///         }
///     }
/// }
/// ```
pub trait Categorical<const N: usize> {
    fn category(&self) -> usize;
}
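
Review note: one thing the const parameter could enable is a fixed-size count table, since `N` is known at compile time. The following is a minimal sketch under that assumption. The `tally` helper is hypothetical and not part of this commit, and the trait and `Coin` type are copied locally so the snippet compiles on its own.

```rust
/// Local copy of the trait introduced in this commit.
pub trait Categorical<const N: usize> {
    fn category(&self) -> usize;
}

#[derive(Debug, Clone, Copy)]
enum Coin {
    Heads,
    Tails,
}

impl Categorical<2> for Coin {
    fn category(&self) -> usize {
        match self {
            Self::Heads => 0,
            Self::Tails => 1,
        }
    }
}

/// Hypothetical helper (an assumption, not code from the commit):
/// counts how many observations fall into each of the `N` categories.
/// Indexing panics if `category()` returns a value outside `[0, N)`.
fn tally<C: Categorical<N>, const N: usize>(observations: &[C]) -> [usize; N] {
    let mut counts = [0usize; N];
    for obs in observations {
        counts[obs.category()] += 1;
    }
    counts
}

fn main() {
    let flips = [Coin::Heads, Coin::Tails, Coin::Heads];
    let counts: [usize; 2] = tally::<Coin, 2>(&flips);
    println!("{counts:?}"); // prints "[2, 1]"
}
```

Because the result is `[usize; N]` rather than a `HashMap`, every category has a slot even when its count is zero, which matches the doc comment's requirement that the keys cover `[0, N)`.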

#[cfg(test)]
mod tests {
    use static_assertions::assert_obj_safe;

    use super::Categorical;

    // The categorical trait must be object-safe.
    assert_obj_safe!(Categorical<5>);
}
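
Review note: the `assert_obj_safe!` check above guards the ability to use `Categorical<5>` behind `dyn`. A minimal sketch of what that permits follows. The `Success` and `ServerError` types and the `categories` function are hypothetical stand-ins, not code from this repository.

```rust
/// Local copy of the trait introduced in this commit.
pub trait Categorical<const N: usize> {
    fn category(&self) -> usize;
}

// Hypothetical types for illustration only.
struct Success;
struct ServerError;

impl Categorical<5> for Success {
    fn category(&self) -> usize {
        1
    }
}

impl Categorical<5> for ServerError {
    fn category(&self) -> usize {
        4
    }
}

/// Collects the category index of each boxed trait object.
/// This only compiles because the trait has no generic methods
/// and no methods that return `Self`.
fn categories(items: &[Box<dyn Categorical<5>>]) -> Vec<usize> {
    items.iter().map(|item| item.category()).collect()
}

fn main() {
    let items: Vec<Box<dyn Categorical<5>>> =
        vec![Box::new(Success), Box::new(ServerError)];
    println!("{:?}", categories(&items)); // prints "[1, 4]"
}
```

Note that `dyn Categorical<5>` and `dyn Categorical<2>` are distinct types, so a single collection can only mix implementors that share the same `N`.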
6 changes: 4 additions & 2 deletions src/stats/chi.rs
@@ -25,13 +25,15 @@ fn degrees_of_freedom<Cat: EnumerableCategory>(table: &impl ContingencyTable<Cat
/// created, its keys must already be known and initialized with zero values.
///
/// ```rust
+/// use std::collections::HashSet;
+/// use canary::stats::EnumerableCategory;
///
/// #[derive(PartialEq, Eq, Debug, Hash)]
/// enum Coin {
///     Heads,
///     Tails,
/// }
-/// use std::collections::HashSet;
-/// use canary::stats::EnumerableCategory;
///
/// impl EnumerableCategory for Coin {
///     fn groups() -> Box<dyn Iterator<Item = Self>> {
///         Box::new([Coin::Heads, Coin::Tails].into_iter())
3 changes: 3 additions & 0 deletions src/stats/mod.rs
@@ -1,5 +1,6 @@
use std::collections::HashMap;

pub use categorical::Categorical;
pub use chi::EnumerableCategory;
pub use group::Group;
pub use observation::{CategoricalObservation, Observation};
@@ -99,6 +100,8 @@ impl ChiSquareEngine {
/// This type maps the dependent variable to its count.
type Table = HashMap<ResponseStatusCode, usize>;

/// For modeling categorical data.
mod categorical;
/// Contains the engine to calculate the chi-square test statistic.
mod chi;
/// `group` defines the two groups.