A Python wrapper for the Rust light-curve-feature and light-curve-dmdt packages, which provides a collection of high-performance time-series feature extractors.
python3 -mpip install 'light-curve[full]'
The full extra installs the package with all optional Python dependencies required by experimental features.
We also provide the light-curve-python package, which is just an "alias" for the main light-curve[full] package.
The minimum supported Python version is 3.9. We provide binary CPython wheels via PyPI for a number of platforms and architectures. We also provide binary wheels for the stable CPython ABI, so the package is guaranteed to work with all future CPython 3 versions.
Arch \ OS | Linux glibc 2.17+ | Linux musl 1.1+ | macOS | Windows #186 |
---|---|---|---|---|
x86-64 | wheel (MKL) | wheel (MKL) | wheel 13+ | wheel (no Ceres, no GSL) |
i686 | src | src | — | not tested |
aarch64 | wheel | wheel | wheel 14+ | not tested |
ppc64le | wheel | not tested (no Rust toolchain) | — | — |
- "wheel": a binary wheel is available on pypi.org, so local building is not required for the platform; the only prerequisite is a recent pip version. For Linux x86-64 we provide binary wheels built with Intel MKL for better periodogram performance, which is not a default build option. For Windows x86-64 we provide a wheel with no Ceres and no GSL support, which is not a default build option.
- "src": the package is confirmed to build and pass unit tests locally, but testing and package building are not supported by CI. See the "Build from source" section below for details.
- "not tested": building from the source code is not tested; please report the build status to us via issue/PR/email.
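When reporting a build status for a "src" or "not tested" platform, it helps to include the basic platform details. The snippet below is a minimal sketch that uses only the Python standard library; the helper name `build_report` is hypothetical and not part of this package.

```python
import platform


def build_report() -> dict:
    """Collect platform details that are useful in a build-status report."""
    # libc_ver() returns empty strings when the C library cannot be identified
    libc_name, libc_version = platform.libc_ver()
    return {
        "os": platform.system(),  # e.g. "Linux", "Darwin", "Windows"
        "arch": platform.machine(),  # e.g. "x86_64", "aarch64", "ppc64le"
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "libc": f"{libc_name} {libc_version}".strip(),
    }


report = build_report()
for key, value in report.items():
    print(f"{key}: {value}")
```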
macOS wheels require relatively new OS versions; please open an issue if you require support for older Macs, see #376 for details.
We stopped publishing PyPy wheels (#345); please feel free to open an issue if you need them.
See below for details on how to build the package from the source code.
Most of the classes implement various feature evaluators useful for light-curve based astrophysical source classification and characterisation.
import light_curve as lc
import numpy as np
# Time values can be non-evenly separated but must be an ascending array
n = 101
t = np.linspace(0.0, 1.0, n)
perfect_m = 1e3 * t + 1e2
err = np.sqrt(perfect_m)
m = perfect_m + np.random.normal(0, err)
# Half-amplitude of magnitude
amplitude = lc.Amplitude()
# Fraction of points beyond nstd standard deviations from the mean
beyond_std = lc.BeyondNStd(nstd=1)
# Slope, its error and reduced chi^2 of linear fit
linear_fit = lc.LinearFit()
# Feature extractor: it evaluates all features together in a more efficient way
extractor = lc.Extractor(amplitude, beyond_std, linear_fit)
# Array with all 5 extracted features
result = extractor(t, m, err, sorted=True, check=False)
print('\n'.join(f"{name} = {value:.2f}" for name, value in zip(extractor.names, result)))
# Run in parallel for multiple light curves:
results = amplitude.many(
    [(t[:i], m[:i], err[:i]) for i in range(n // 2, n)],
    n_jobs=-1,
    sorted=True,
    check=False,
)
print("Amplitude of amplitude is {:.2f}".format(np.ptp(results)))
If you're confident in your inputs, you can use sorted=True (t is in ascending order) and check=False (no NaNs in inputs, no infs in t or m) for better performance.
Note that if your inputs are not valid and are not validated by sorted=None and check=True (the default values), then all kinds of bad things could happen.
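Before switching the validation off, you may want to check the inputs once yourself. The following is a minimal NumPy sketch of the conditions listed above; the helper `can_skip_checks` is hypothetical and not part of the package, and it treats "ascending" as non-decreasing, which is an assumption.

```python
import numpy as np


def can_skip_checks(t, m, sigma=None):
    """Return True if the inputs satisfy what sorted=True and check=False assume."""
    t = np.asarray(t, dtype=float)
    m = np.asarray(m, dtype=float)
    if t.ndim != 1 or t.shape != m.shape:
        return False
    # sorted=True assumes that t is in ascending order
    # (non-decreasing here; use `<= 0` if you need strictly ascending times)
    if np.any(np.diff(t) < 0):
        return False
    # check=False assumes no NaNs and no infs in t or m
    if not (np.all(np.isfinite(t)) and np.all(np.isfinite(m))):
        return False
    # ... and no NaNs in the uncertainties, if they are given
    if sigma is not None and np.any(np.isnan(np.asarray(sigma, dtype=float))):
        return False
    return True


print(can_skip_checks([0.0, 1.0, 2.0], [1.0, 2.0, 3.0]))
print(can_skip_checks([2.0, 1.0, 0.0], [1.0, 2.0, 3.0]))
```

If the helper returns False, keep the default sorted=None and check=True so that the package validates the inputs itself.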
Print feature classes list
import light_curve as lc
print([x for x in dir(lc) if hasattr(getattr(lc, x), "names")])
Read feature docs
import light_curve as lc
help(lc.BazinFit)
See the complete list of available feature evaluators and documentation
in
light-curve-feature
Rust crate docs.
Names marked "(experimental)" are experimental features.
While we usually say "magnitude" and use "m" as a time-series value, some of the features are supposed to be
used with
flux light-curves.
The last column indicates whether the feature should be used with flux light curves only, magnitude light
curves only,
or any kind of light curves.
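If your data are in magnitudes and you need a flux-only feature, the light curve has to be converted first. The sketch below uses the standard Pogson relation with first-order error propagation; the function name `mag_to_flux` and the zero point of 8.9 (AB magnitudes with flux in Jy) are assumptions for illustration, so substitute the zero point of your own photometric system.

```python
import numpy as np


def mag_to_flux(m, sigma_m, zero_point=8.9):
    """Convert magnitudes and their errors to fluxes and flux errors."""
    m = np.asarray(m, dtype=float)
    sigma_m = np.asarray(sigma_m, dtype=float)
    # Pogson relation: m - zero_point = -2.5 * lg(flux)
    flux = 10.0 ** (-0.4 * (m - zero_point))
    # First-order error propagation: |d flux / d m| = 0.4 * ln(10) * flux
    sigma_flux = 0.4 * np.log(10.0) * flux * sigma_m
    return flux, sigma_flux


m = np.array([8.9, 11.4, 13.9])
sigma_m = np.array([0.01, 0.02, 0.05])
flux, sigma_flux = mag_to_flux(m, sigma_m)
print(flux)
print(sigma_flux)
```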
Feature name | Description | Min data points | Features number | Flux/magnitude |
---|---|---|---|---|
Amplitude | Half amplitude of magnitude | 1 | 1 | Flux or magn |
AndersonDarlingNormal | Unbiased Anderson–Darling normality test statistic | 4 | 1 | Flux or magn |
BazinFit | Five fit parameters and goodness of fit (reduced χ²) | 6 | 1 | Flux only |
BeyondNStd | Fraction of observations beyond n standard deviations from the mean | 2 | 1 | Flux or magn |
ColorOfMedian (experimental) | Magnitude difference between medians of two bands | 2 | 1 | Magn only |
Cusum | A range of cumulative sums | 2 | 1 | Flux or magn |
Eta | Von Neumann η | 2 | 1 | Flux or magn |
EtaE | Modernisation of Eta for unevenly sampled time series | 2 | 1 | Flux or magn |
ExcessVariance | Measure of the variability amplitude | 2 | 1 | Flux only |
FluxNNotDetBeforeFd (experimental) | Number of non-detections before the first detection | 2 | 1 | Flux only |
InterPercentileRange | | 1 | 1 | Flux or magn |
Kurtosis | Excess kurtosis of magnitude | 4 | 1 | Flux or magn |
LinearFit | The slope, its error and reduced χ² of the linear fit | 3 | 3 | Flux or magn |
LinearTrend | The slope and its error from a linear fit of a magnitude light curve, ignoring the observation errors | 2 | 2 | Flux or magn |
MagnitudeNNotDetBeforeFd (experimental) | Number of non-detections before the first detection | 2 | 1 | Magn only |
MagnitudePercentageRatio | Magnitude percentage ratio | 1 | 1 | Flux or magn |
MaximumSlope | Maximum slope between two consecutive observations | 2 | 1 | Flux or magn |
Mean | Mean magnitude | 1 | 1 | Flux or magn |
MeanVariance | Standard deviation to mean ratio | 2 | 1 | Flux only |
Median | Median magnitude | 1 | 1 | Flux or magn |
MedianAbsoluteDeviation | Median of the absolute value of the difference between magnitude and its median | 1 | 1 | Flux or magn |
MedianBufferRangePercentage | Fraction of points within | 1 | 1 | Flux or magn |
OtsuSplit | Difference of subset means, standard deviation of the lower subset, standard deviation of the upper subset and lower-to-all observation count ratio for two subsets of magnitudes obtained by Otsu's method split. Otsu's method is used to perform automatic thresholding. The algorithm returns a single threshold that separates values into two classes. This threshold is determined by minimizing intra-class intensity variance | 2 | 4 | Flux or magn |
PercentAmplitude | Maximum deviation of magnitude from its median | 1 | 1 | Flux or magn |
PercentDifferenceMagnitudePercentile | Ratio of | 1 | 1 | Flux only |
RainbowFit (experimental) | Seven fit parameters and goodness of fit (reduced χ²) | 6 | 1 | Flux only |
ReducedChi2 | Reduced χ² | 2 | 1 | Flux or magn |
Roms (experimental) | Robust median statistic | 2 | 1 | Flux or magn |
Skew | Skewness of magnitude | 3 | 1 | Flux or magn |
StandardDeviation | Standard deviation of magnitude | 2 | 1 | Flux or magn |
StetsonK | Stetson K coefficient describing the light curve shape | 2 | 1 | Flux or magn |
VillarFit | Seven fit parameters and goodness of fit (reduced χ²) | 8 | 8 | Flux only |
WeightedMean | Weighted mean magnitude | 1 | 1 | Flux or magn |
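To make a few of the simpler table entries concrete, here is a plain NumPy sketch that follows their verbal descriptions. These are illustrative reference functions, not the package's implementation; in particular, the use of the sample standard deviation (ddof=1) in `beyond_n_std` is an assumption.

```python
import numpy as np


def amplitude(m):
    """Half amplitude of magnitude."""
    return 0.5 * (np.max(m) - np.min(m))


def beyond_n_std(m, nstd=1.0):
    """Fraction of observations beyond nstd standard deviations from the mean."""
    std = np.std(m, ddof=1)  # assumption: sample standard deviation
    return np.mean(np.abs(m - np.mean(m)) > nstd * std)


def median_absolute_deviation(m):
    """Median of the absolute difference between magnitude and its median."""
    return np.median(np.abs(m - np.median(m)))


def percent_amplitude(m):
    """Maximum deviation of magnitude from its median."""
    return np.max(np.abs(m - np.median(m)))


m = np.array([1.0, 2.0, 3.0, 4.0, 10.0])
print(amplitude(m))  # 4.5
print(beyond_n_std(m))  # 0.2
print(median_absolute_deviation(m))  # 1.0
print(percent_amplitude(m))  # 7.0
```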
Meta-features can accept other feature extractors and apply them to pre-processed data.
This feature transforms time-series data into the Lomb-Scargle periodogram, providing an estimation of the power spectrum. The peaks argument corresponds to the number of the most significant spectral density peaks to return. For each peak, its period and "signal-to-noise" ratio are returned.
The optional features argument accepts a list of additional feature evaluators, which are applied to the power spectrum: frequency is passed as "time," power spectrum is passed as "magnitude," and no uncertainties are set.
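The idea can be illustrated with a small, self-contained NumPy sketch of the classical Lomb-Scargle periodogram. This is not the package's implementation (which is written in Rust and has its own frequency grid and normalization); the function `lomb_scargle`, the frequency grid, and the peak "signal-to-noise" definition used here are assumptions for illustration only.

```python
import numpy as np


def lomb_scargle(t, m, freqs):
    """Classical Lomb-Scargle power at the given frequencies (cycles per unit time)."""
    m = m - np.mean(m)
    var = np.var(m, ddof=1)
    power = np.empty(len(freqs))
    for i, freq in enumerate(freqs):
        omega = 2.0 * np.pi * freq
        # Time offset that makes the sine and cosine terms orthogonal
        tau = np.arctan2(np.sum(np.sin(2.0 * omega * t)), np.sum(np.cos(2.0 * omega * t))) / (2.0 * omega)
        c = np.cos(omega * (t - tau))
        s = np.sin(omega * (t - tau))
        power[i] = 0.5 * ((m @ c) ** 2 / (c @ c) + (m @ s) ** 2 / (s @ s)) / var
    return power


# Unevenly sampled sinusoid with a known period and some noise
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 10.0, 200))
true_period = 0.25
m = np.sin(2.0 * np.pi * t / true_period) + rng.normal(0.0, 0.1, t.size)

freqs = np.linspace(0.1, 10.0, 2000)
power = lomb_scargle(t, m, freqs)

# The most significant peak: its period and an assumed "signal-to-noise" measure,
# here the peak height above the mean power in units of the power's standard deviation
best = np.argmax(power)
peak_period = 1.0 / freqs[best]
peak_s_to_n = (power[best] - np.mean(power)) / np.std(power)

# A feature applied to the power spectrum: frequency plays the role of "time",
# power plays the role of "magnitude", and there are no uncertainties
spectrum_amplitude = 0.5 * (np.max(power) - np.min(power))

print(f"period = {peak_period:.4f}, S/N = {peak_s_to_n:.1f}, amplitude = {spectrum_amplitude:.1f}")
```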
Binning time series to bins with width
Binned time series is defined by
As of v0.8, experimental extractors (see below) support multi-band light-curve inputs.
import numpy as np
from light_curve.light_curve_py import LinearFit
t = np.arange(20, dtype=float)
m = np.arange(20, dtype=float)
sigma = np.full_like(t, 0.1)
bands = np.array(["g"] * 10 + ["r"] * 10)
feature = LinearFit(bands=["g", "r"])
values = feature(t, m, sigma, bands)
print(values)
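Conceptually, a multi-band feature can be pictured as the single-band feature evaluated on each passband's subset of observations. The NumPy sketch below illustrates that picture with a weighted straight-line fit per band; it is an assumption about the general idea, not a description of how the package computes or orders its outputs.

```python
import numpy as np

t = np.arange(20, dtype=float)
m = np.arange(20, dtype=float)
sigma = np.full_like(t, 0.1)
bands = np.array(["g"] * 10 + ["r"] * 10)

slopes = {}
for band in ["g", "r"]:
    mask = bands == band  # select observations of one passband
    # Weighted least-squares line; np.polyfit expects weights of 1/sigma
    slope, intercept = np.polyfit(t[mask], m[mask], deg=1, w=1.0 / sigma[mask])
    slopes[band] = slope

print(slopes)
```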
Rainbow (Russeil+23) is a black-body parametric model for transient light curves.
By default, it uses the Bazin function as a model for bolometric flux evolution and a logistic function for the temperature evolution.
The user may customize the model by providing their own functions for bolometric flux and temperature evolution.
This example demonstrates the reconstruction of a synthetic light curve with this model.
RainbowFit requires the iminuit package.
import numpy as np
from light_curve.light_curve_py import RainbowFit
def bb_nu(wave_aa, T):
    """Black-body spectral model"""
    nu = 3e10 / (wave_aa * 1e-8)
    return 2 * 6.626e-27 * nu ** 3 / 3e10 ** 2 / np.expm1(6.626e-27 * nu / (1.38e-16 * T))
# Effective wavelengths in Angstrom
band_wave_aa = {"g": 4770.0, "r": 6231.0, "i": 7625.0, "z": 9134.0}
# Parameter values
reference_time = 60000.0 # time close to the peak time
# Bolometric flux model parameters
amplitude = 1.0 # bolometric flux semiamplitude, arbitrary (non-spectral) flux/luminosity units
rise_time = 5.0 # exponential growth timescale, days
fall_time = 30.0 # exponential decay timescale, days
# Temperature model parameters
Tmin = 5e3 # temperature on +infinite time, kelvins
delta_T = 10e3 # (Tmin + delta_T) is temperature on -infinite time, kelvins
k_sig = 4.0 # temperature evolution timescale, days
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(reference_time - 3 * rise_time, reference_time + 3 * fall_time, 1000))
band = rng.choice(list(band_wave_aa), size=len(t))
waves = np.array([band_wave_aa[b] for b in band])
# Temperature evolution is a sigmoid function
temp = Tmin + delta_T / (1.0 + np.exp((t - reference_time) / k_sig))
# Bolometric flux evolution is the Bazin function
lum = amplitude * np.exp(-(t - reference_time) / fall_time) / (
    1.0 + np.exp(-(t - reference_time) / rise_time))
# Spectral flux density for each given pair of time and passband
flux = np.pi * bb_nu(waves, temp) / (5.67e-5 * temp ** 4) * lum
# S/N = 5 for minimum flux, scale for Poisson noise
flux_err = np.sqrt(flux * np.min(flux) / 5.0)
flux += rng.normal(0.0, flux_err)
feature = RainbowFit.from_angstrom(band_wave_aa, with_baseline=False)
values = feature(t, flux, sigma=flux_err, band=band)
print(dict(zip(feature.names, values)))
print(f"Goodness of fit: {values[-1]}")
Note that while we don't use precise physical constant values to generate the data, RainbowFit uses CODATA 2018 values.
From the technical point of view, the package consists of two parts: a wrapper for the light-curve-feature Rust crate (the light_curve_ext sub-package) and the pure-Python sub-package light_curve_py.
We use the Python implementation of feature extractors to test the Rust implementation and to implement new experimental extractors.
Please note that the Python implementation is much slower for most of the extractors and doesn't provide the same functionality as the Rust implementation.
However, the Python implementation provides some new feature extractors you may find useful.
You can manually use extractors from both implementations:
import numpy as np
from numpy.testing import assert_allclose
from light_curve.light_curve_ext import LinearTrend as RustLinearTrend
from light_curve.light_curve_py import LinearTrend as PythonLinearTrend
rust_fe = RustLinearTrend()
py_fe = PythonLinearTrend()
n = 100
t = np.sort(np.random.normal(size=n))
m = 3.14 * t - 2.16 + np.random.normal(size=n)
assert_allclose(rust_fe(t, m), py_fe(t, m),
                err_msg="Python and Rust implementations must provide the same result")
This should print a warning about the experimental status of the Python class.
You can run all benchmarks from the Python project folder with python3 -mpytest --benchmark-enable tests/test_w_bench.py, or with slow benchmarks disabled: python3 -mpytest -m "not (nobs or multi)" --benchmark-enable tests/test_w_bench.py.
Here we benchmark the Rust implementation (rust) versus the feets package and our own Python implementation (lc_py) for a light curve with n=1000 observations.
The plot shows that the Rust implementation of the package outperforms the others by a factor of 1.5 to 50.
This allows extracting a large set of "cheap" features in well under one millisecond for n=1000.
The performance of parametric fits (BazinFit and VillarFit) and Periodogram depends on their parameters, but the typical timescale of feature extraction including these features is 20 to 50 ms for a few hundred observations.
Benchmark results of several features for both the pure-Python and Rust implementations of the "light-curve" package, as a function of the number of observations in a light curve. Both the x-axis and y-axis are on a logarithmic scale.
Processing time per single light curve for extraction of the feature subset presented in the first benchmark, versus the number of CPU cores used. The dataset consists of 10,000 light curves with 1,000 observations each.
See more detailed descriptions of the benchmarks in "Performant feature extraction for photometric time series".
Class DmDt provides a dm–dt mapper (based on Mahabal et al. 2011, Soraisam et al. 2020).
It is a Python wrapper for the light-curve-dmdt Rust crate.
import numpy as np
from light_curve import DmDt
from numpy.testing import assert_array_equal
dmdt = DmDt.from_borders(min_lgdt=0, max_lgdt=np.log10(3), max_abs_dm=3, lgdt_size=2, dm_size=4, norm=[])
t = np.array([0, 1, 2], dtype=np.float32)
m = np.array([0, 1, 2], dtype=np.float32)
desired = np.array(
[
[0, 0, 2, 0],
[0, 0, 0, 1],
]
)
actual = dmdt.points(t, m)
assert_array_equal(actual, desired)
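The mapping in the example above can be pictured as a two-dimensional histogram of all observation pairs on a (lg dt, dm) grid. The following NumPy sketch reproduces the same counts for the same inputs; the function `dmdt_points` is hypothetical, and its handling of bin edges follows np.histogram2d, which may differ from the Rust crate at the grid borders.

```python
import numpy as np


def dmdt_points(t, m, min_lgdt, max_lgdt, max_abs_dm, lgdt_size, dm_size):
    """Count pairs of observations on a (lg dt, dm) grid."""
    i, j = np.triu_indices(len(t), k=1)  # all pairs with i < j
    dt = t[j] - t[i]
    dm = m[j] - m[i]
    positive = dt > 0  # lg dt is undefined for simultaneous observations
    lgdt_edges = np.linspace(min_lgdt, max_lgdt, lgdt_size + 1)
    dm_edges = np.linspace(-max_abs_dm, max_abs_dm, dm_size + 1)
    counts, _, _ = np.histogram2d(
        np.log10(dt[positive]), dm[positive], bins=[lgdt_edges, dm_edges]
    )
    return counts.astype(int)


t = np.array([0.0, 1.0, 2.0])
m = np.array([0.0, 1.0, 2.0])
counts = dmdt_points(t, m, min_lgdt=0.0, max_lgdt=np.log10(3), max_abs_dm=3.0, lgdt_size=2, dm_size=4)
print(counts)
```

Two pairs have dt = 1 and dm = 1 and one pair has dt = 2 and dm = 2, which gives the two non-zero cells of the map.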
Install the Rust toolchain and Python 3.9+.
It is recommended to use rustup to install the Rust toolchain and to update it periodically with rustup update.
Clone the code, create and activate a virtual environment.
git clone https://github.com/light-curve/light-curve-python.git
cd light-curve-python/light-curve
python3 -m venv venv
source venv/bin/activate
Install the package in editable mode (see more details about building from source below).
python -mpip install maturin
# --release would take longer, but the package would be faster
# Put other Cargo flags if needed, e.g. --no-default-features --features=fftw-source,ceres-source
maturin develop --extras=dev
Next time you can just run source venv/bin/activate to activate the environment and maturin develop to rebuild the Rust code if it has changed.
You don't need to reinstall the package if you change Python code.
You also don't need to add --extras=dev next time; it is needed only to install development dependencies.
You are also encouraged to install pre-commit
hooks to keep the codebase clean.
You can get it with pip
(see the documentation for other ways), and then
install
the hooks with
pre-commit install
All test-related dependencies are installed with --extras=dev
flag, so you don't need to install anything
else.
You can run tests with pytest
:
python -mpytest
Benchmarks are disabled by default; you can enable them with the --benchmark-enable flag:
python -mpytest --benchmark-enable
See the Benchmarks section for more details.
The package has a number of compile-time features, mostly to control which C/C++ dependencies are used.
The list of these Cargo features may be passed to maturin with the --features flag; it is also recommended to use --no-default-features to avoid building unnecessary dependencies.
The following features are available:
- abi3 (default) - enables CPython ABI3 compatibility; turn it off for other interpreters or if you believe that the code would be faster without it (our benchmarks show that it is not the case).
- ceres-source (default) - enables Ceres solver support and builds it from sources. You need a C++ compiler and cmake available on your system. Known to not work on Windows. It is used as an optional optimization algorithm for BazinFit and VillarFit.
- ceres-system - enables Ceres solver support, but links with a dynamic library. You need to have a compatible version of Ceres installed on your system.
- fftw-source (default) - enables FFTW support and builds it from sources. You need a C compiler available on your system. Note that at least one of the fftw-* features must be activated.
- fftw-system - enables FFTW support, but links with a dynamic library. You need to have a compatible version of FFTW installed on your system.
- fftw-mkl - enables FFTW support with the Intel MKL backend. Intel MKL will be downloaded automatically during the build. Highly recommended for Intel CPUs to achieve up to 2x faster "fast" periodogram calculation.
- gsl (default) - enables GNU Scientific Library support. You need a compatible version of GSL installed on your system. It is used as an optional optimization algorithm for BazinFit and VillarFit.
- mimalloc (default) - enables mimalloc memory allocator support. Our benchmarks show up to 2x speedup for some simple features, but it may lead to larger memory consumption.
You can build the package with maturin
(a Python package for building and publishing Rust crates as Python
packages).
This example shows how to build the package with minimal dependencies.
python -mpip install maturin
maturin build --release --locked --no-default-features --features=abi3,fftw-source,mimalloc
Here we use --release
to build the package in release mode (slower build, faster execution), --locked
to
ensure
reproducible builds, --no-default-features
to disable default features, and
--features=abi3,fftw-source,mimalloc
to enable FFTW (builds from vendored sources), ABI3 compatibility, and mimalloc memory allocator.
You can also build the package with build
(a Python package for building and installing Python packages from
source).
python -mpip install build
MATURIN_PEP517_ARGS="--locked --no-default-features --features=abi3,fftw-source,mimalloc" python -m build
cibuildwheel is a project that builds wheels for Python packages on CI servers; we use it to build wheels with GitHub Actions.
You can use it locally to build wheels on your platform (change the platform identifier to one from the list of supported platforms):
python -mpip install cibuildwheel
python -m cibuildwheel --only=cp39-manylinux_x86_64
Please note that we use different Cargo feature sets for different platforms, which are defined in pyproject.toml.
You can build Windows wheels on Windows, Linux wheels on any platform with Docker installed (Qemu may be
needed for
cross-architecture builds), and macOS wheels on macOS.
On Windows and macOS some additional dependencies will be installed automatically, please check
the cibuildwheel documentation and pyproject.toml
for details.
Also, macOS builds require MACOSX_DEPLOYMENT_TARGET
to be set to the current version of macOS, because
dependent
libraries installed from homebrew
are built with this target:
export MACOSX_DEPLOYMENT_TARGET=$(sw_vers -productVersion | awk -F '.' '{print $1"."0}')
python -m cibuildwheel --only=cp39-macosx_arm64
unset MACOSX_DEPLOYMENT_TARGET
Since we use ABI3 compatibility, you can build wheels for a single Python version (currently 3.9+) and they will work with any later version of CPython.
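The compatibility rule can be read off the wheel filename: an abi3 wheel tagged for CPython 3.N is expected to work on CPython 3.N and later. The sketch below is a simplified standard-library check of that rule; the helper `abi3_wheel_supports` and the wheel filename are hypothetical, and the check ignores the platform tag and compound Python tags, which pip also takes into account.

```python
import sys


def abi3_wheel_supports(wheel_filename, version_info=sys.version_info):
    """Return True if an abi3 wheel's tags cover the given CPython version."""
    # Wheel filenames end with "-{python tag}-{abi tag}-{platform tag}.whl"
    python_tag, abi_tag, _platform_tag = wheel_filename[: -len(".whl")].split("-")[-3:]
    if abi_tag != "abi3" or not python_tag.startswith("cp3"):
        return False
    # "cp39" means the wheel was built against CPython 3.9
    minimum_minor = int(python_tag[len("cp3"):])
    return version_info[0] == 3 and version_info[1] >= minimum_minor


# Hypothetical filename, used for illustration only
wheel = "light_curve-0.0.0-cp39-abi3-manylinux_2_17_x86_64.whl"
print(abi3_wheel_supports(wheel, (3, 9)))   # True
print(abi3_wheel_supports(wheel, (3, 13)))  # True
print(abi3_wheel_supports(wheel, (3, 8)))   # False
```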
If you found this project useful for your research, please cite Malanchev et al., 2021:
@ARTICLE{2021MNRAS.502.5147M,
author = {{Malanchev}, K.~L. and {Pruzhinskaya}, M.~V. and {Korolev}, V.~S. and {Aleo}, P.~D. and {Kornilov}, M.~V. and {Ishida}, E.~E.~O. and {Krushinsky}, V.~V. and {Mondon}, F. and {Sreejith}, S. and {Volnova}, A.~A. and {Belinski}, A.~A. and {Dodin}, A.~V. and {Tatarnikov}, A.~M. and {Zheltoukhov}, S.~G. and {(The SNAD Team)}},
title = "{Anomaly detection in the Zwicky Transient Facility DR3}",
journal = {\mnras},
keywords = {methods: data analysis, astronomical data bases: miscellaneous, stars: variables: general, Astrophysics - Instrumentation and Methods for Astrophysics, Astrophysics - Solar and Stellar Astrophysics},
year = 2021,
month = apr,
volume = {502},
number = {4},
pages = {5147-5175},
doi = {10.1093/mnras/stab316},
archivePrefix = {arXiv},
eprint = {2012.01419},
primaryClass = {astro-ph.IM},
adsurl = {https://ui.adsabs.harvard.edu/abs/2021MNRAS.502.5147M},
adsnote = {Provided by the SAO/NASA Astrophysics Data System}
}