This repository has been archived by the owner on Jun 21, 2022. It is now read-only.

Releases: scikit-hep/awkward-0.x

0.9.0rc1

11 Apr 22:45

Deploying release candidate so that uproot continuous integration is not broken.

0.8.15

11 Apr 17:59
d705e6f

PRs #117, #118, #120: new JaggedArray.choose and argchoose, as well as generalization of concatenate to ObjectArrays and Tables.
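As a rough illustration of what choosing combinations within each sub-list means, here is a pure-Python sketch that works on nested lists rather than JaggedArrays. The helper names choose_lists and argchoose_lists are hypothetical, and the ordering and container type of awkward's own output are not claimed to match.

```python
from itertools import combinations

def argchoose_lists(jagged, n):
    """For each sub-list, return every n-element combination of indices."""
    return [list(combinations(range(len(sub)), n)) for sub in jagged]

def choose_lists(jagged, n):
    """For each sub-list, return every n-element combination of values."""
    return [list(combinations(sub, n)) for sub in jagged]

events = [[1.1, 2.2, 3.3], [], [4.4, 5.5]]
pairs = choose_lists(events, 2)           # value pairs, one list per sub-list
index_pairs = argchoose_lists(events, 2)  # the same pairs, expressed as indices
```

Sub-lists with fewer than n items simply produce an empty list of combinations, so the outer length is preserved.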

0.8.14

29 Mar 15:35

Extended the deserialization whitelist, primarily with uproot_methods.classes.* (so that TLorentzVectors and similar objects can be deserialized without manually adding them to the whitelist).

0.8.13

29 Mar 14:49
ad4a82c

PR #114: awkward.save and awkward.toparquet now have the same order of arguments: first filename, then the array(s) to save (and awkward.save's order was chosen to be consistent with numpy.save).

0.8.12

25 Mar 20:24
97d5fa7

PR #105 fixed two cases of not checking for empty arrays before calling .max().

Added pad and fillna to turn jagged arrays into NumPy arrays:

import awkward

a = awkward.fromiter([[1.1, 2.2, 3.3, 4.4, 5.5], [], [6.6, 7.7, 8.8], [9.9]])
a.pad(3)
# returns [[1.1 2.2 3.3 4.4 5.5] [None None None] [6.6 7.7 8.8] [9.9 None None]]
a.pad(3, clip=True)
# returns [[1.1 2.2 3.3] [None None None] [6.6 7.7 8.8] [9.9 None None]]
a.pad(3, clip=True).fillna(999)
# returns [[1.1 2.2 3.3] [999.0 999.0 999.0] [6.6 7.7 8.8] [9.9 999.0 999.0]]
a.pad(3, clip=True).fillna(999).regular()
# returns [[  1.1,   2.2,   3.3],
#          [999. , 999. , 999. ],
#          [  6.6,   7.7,   8.8],
#          [  9.9, 999. , 999. ]]
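The padding, clipping, and filling steps can be mimicked on plain nested lists, which may help show why the final result is rectangular. This is only a sketch of the idea: pad_lists and fillna_lists are hypothetical helpers, not awkward's implementation.

```python
def pad_lists(jagged, length, clip=False):
    """Pad each sub-list with None up to `length`; with clip=True, also truncate."""
    out = []
    for sub in jagged:
        row = list(sub[:length]) if clip else list(sub)
        # Multiplying a list by a negative number gives [], so long rows are untouched.
        row += [None] * (length - len(row))
        out.append(row)
    return out

def fillna_lists(jagged, value):
    """Replace every None with `value`."""
    return [[value if x is None else x for x in sub] for sub in jagged]

a = [[1.1, 2.2, 3.3, 4.4, 5.5], [], [6.6, 7.7, 8.8], [9.9]]
padded = pad_lists(a, 3)               # rows may still be longer than 3
clipped = pad_lists(a, 3, clip=True)   # every row has exactly 3 items
filled = fillna_lists(clipped, 999)    # no None left, so the rows form a 4x3 grid
```

Only the clipped and filled version has rows of equal length with no missing values, which is the precondition for treating it as a regular two-dimensional array.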

0.8.11

11 Mar 16:52

0.8.9 and 0.8.10 broke uproot's tree.pandas.df() because that function (illegally!) used the private method _broadcast. This release puts it back as an alias, which will make uproot work as long as the installed version of awkward isn't in this two-version window.
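Restoring a renamed method as an alias is a common compatibility pattern in Python. The sketch below shows the pattern on a hypothetical stand-in class; JaggedSketch is not awkward's JaggedArray, and the body of its tojagged method is a placeholder whose behavior is an assumption.

```python
class JaggedSketch:
    """Hypothetical stand-in class used only to show the alias pattern."""

    def __init__(self, counts):
        self.counts = counts  # number of items in each sub-list

    def tojagged(self, data):
        # Placeholder body (an assumption): repeat each element of `data`
        # once per item in the corresponding sub-list.
        return [[x] * n for x, n in zip(data, self.counts)]

    # Backward-compatible alias: callers of the old private name keep working.
    _broadcast = tojagged

j = JaggedSketch([2, 0, 1])
new_result = j.tojagged([10, 20, 30])
old_result = j._broadcast([10, 20, 30])  # same function under the old name
```

Because the alias is bound in the class body, both names refer to the same function object, so there is only one implementation to maintain.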

This will be handled properly soon.

0.8.10

10 Mar 08:41
eb7ff85

PR #101: minor bug-fixes on version 0.8.9.

0.8.9

09 Mar 21:28
c7c75d5

Various bug-fixes and improvements to broadcasting from PR #99.

The old internal member function _broadcast has been made part of the public API as tojagged. (Do not confuse this with the internal member function _tojagged, which will sooner or later be removed. The public tojagged, with no underscore, has a different definition and is intended to be maintained.)

0.8.8

09 Mar 09:11
bd803ca

All array types now have an nbytes attribute, which uproot's ArrayCache uses to decide when to evict. Without it, the cache would fill up to a billion arrays rather than a billion bytes!
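To make the difference between counting entries and counting bytes concrete, here is a minimal sketch of a least-recently-used cache whose limit is a byte total. ByteLimitedCache is a hypothetical class written for this note, not uproot's ArrayCache; with awkward arrays the size function would presumably read nbytes, whereas the example uses bytearray objects and len as stand-ins.

```python
from collections import OrderedDict

class ByteLimitedCache:
    """Hypothetical LRU cache limited by total size in bytes, not entry count."""

    def __init__(self, limit_bytes, getsizeof):
        self.limit_bytes = limit_bytes
        self.getsizeof = getsizeof    # function returning the size of a value
        self.used_bytes = 0
        self._items = OrderedDict()   # oldest entry first

    def __setitem__(self, key, value):
        if key in self._items:
            self.used_bytes -= self.getsizeof(self._items.pop(key))
        self._items[key] = value
        self.used_bytes += self.getsizeof(value)
        # Evict the least recently used entries until the total fits.
        # A single oversized entry is kept rather than evicting itself.
        while self.used_bytes > self.limit_bytes and len(self._items) > 1:
            _, evicted = self._items.popitem(last=False)
            self.used_bytes -= self.getsizeof(evicted)

    def __getitem__(self, key):
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def __contains__(self, key):
        return key in self._items

    def __len__(self):
        return len(self._items)

cache = ByteLimitedCache(limit_bytes=100, getsizeof=len)
cache["a"] = bytearray(40)
cache["b"] = bytearray(40)
cache["c"] = bytearray(40)  # total would be 120 bytes, so "a" is evicted
```

If the size function returned 1 for every value, the same limit of 100 would admit a hundred entries of any size, which is the failure mode described above.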

The nbytes value only counts data in arrays, not the Python objects that support those arrays (whose size differs between Python 2 and 3, and which PyPy doesn't track). It also doesn't count ephemeral attributes, even if they are arrays (like JaggedArray._counts, which only exists after the first time JaggedArray.counts is requested), and it makes no distinction between owned and not-owned data, so views are double-counted.

The nbytes algorithm always halts, even if structures have cyclic references (if x.content is x, the nbytes of x are not double-counted and do not lead to infinite recursion).
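One way to get both properties (guaranteed halting on cycles, and double-counting of views) is to remember which containers have been visited while summing every buffer that is reached. The sketch below assumes a hypothetical Node container and uses bytearray and memoryview as stand-ins for arrays and views; it is not awkward's actual code.

```python
class Node:
    """Hypothetical container holding raw buffers and/or other containers."""

    def __init__(self, *children):
        self.children = list(children)

def nbytes(obj, seen=None):
    """Sum buffer sizes reachable from `obj`, visiting each container once."""
    if seen is None:
        seen = set()
    if isinstance(obj, (bytearray, memoryview)):
        # Leaf buffer: counted every time it is reached, with no ownership check.
        return memoryview(obj).nbytes
    if id(obj) in seen:
        return 0                      # container already visited: stop here
    seen.add(id(obj))
    return sum(nbytes(child, seen) for child in obj.children)

buf = bytearray(16)
x = Node(buf)
x.children.append(x)                  # cyclic: x contains itself
cyclic_total = nbytes(x)              # halts, and buf is counted once

view = memoryview(buf)[:8]            # a view onto the first half of buf
y = Node(buf, view)
shared_total = nbytes(y)              # buf plus its view: 16 + 8
```

The visited set only tracks containers, so the cycle is broken without any attempt to detect that two buffers share memory.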

0.8.7

08 Mar 15:59
1681251

This release adds awkward.toarrow and awkward.toparquet, and renames the old functions to awkward.fromarrow and awkward.fromparquet for symmetry. They can only be used if pyarrow is installed; it is not a strict dependency and must be installed explicitly. String columns can be converted from Arrow to Awkward, but not from Awkward to Arrow because of an open question (see comments).

The implemented conversion is really just between Awkward and Arrow, letting pyarrow convert to and from Parquet.

Top-level Awkward Tables (possibly under ChunkedArray or any MaskedArray) are converted into Arrow Tables, but deeper Awkward Tables are converted into Arrow StructArrays.

Arrow arrays with an associated mask add a BitMaskedArray to the Awkward structure. All Awkward MaskedArrays are pushed down to the deepest Arrow level that can accept them. This might not be necessary: a better understanding of how to generate Arrow buffers could remove the need for it.

Python types in Awkward ObjectArrays can't be saved to Arrow, as it's a multilingual serialization system.

Awkward VirtualArrays are evaluated before converting to Arrow. When reading from Parquet, all columns of all chunks are presented as Awkward VirtualArrays so that they may be read lazily. By default, Awkward VirtualArrays are read-once: the VirtualArray object keeps a reference to the materialized array. That's good for performance when the same data are read repeatedly, but bad for memory use. The cache parameter of fromparquet lets you pass a dict-like cache, such as one from the cachetools library.
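The trade-off between read-once behavior and an external cache can be sketched with a small lazy wrapper. LazyColumn is a hypothetical class, not awkward's VirtualArray, and a plain dict stands in for the dict-like cache; a dict never evicts on its own, so the example clears it by hand to imitate eviction.

```python
class LazyColumn:
    """Hypothetical lazily-read column with an optional dict-like cache."""

    def __init__(self, key, read, cache=None):
        self.key = key
        self.read = read      # zero-argument function that loads the data
        self.cache = cache    # optional dict-like object, possibly shared
        self._own = None      # private reference used when no cache is given

    def materialize(self):
        if self.cache is None:
            # Read-once: the data stay alive as long as this object does.
            if self._own is None:
                self._own = self.read()
            return self._own
        # External cache: it may have dropped the data, so fall back to reading.
        try:
            return self.cache[self.key]
        except KeyError:
            data = self.read()
            self.cache[self.key] = data
            return data

reads = []

def read_column():
    reads.append("read")      # record each simulated disk read
    return [1.1, 2.2, 3.3]

pinned = LazyColumn("x", read_column)
pinned.materialize()
pinned.materialize()          # second call reuses the pinned reference
reads_when_pinned = len(reads)

cache = {}
cached = LazyColumn("x", read_column, cache=cache)
cached.materialize()
cache.clear()                 # imitate the cache evicting the entry
cached.materialize()          # data must be read again
reads_total = len(reads)
```

With an evicting cache, memory use is bounded by the cache's policy at the cost of occasional re-reads, whereas the read-once default never re-reads but never frees.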

Awkward ChunkedArrays become RecordBatches in a Table in toarrow but separate Tables in toparquet. When reading fromparquet, the separate Tables define the level of granularity for incremental reading.

If toparquet is given an iterable of Awkward data, it will incrementally write the Parquet file. The same can be achieved by an Awkward ChunkedArray of Tables of VirtualArray, which is what fromparquet returns, so the output of fromparquet can be used as input to toparquet.