Julia package to read, write and detect streams of formatted data.
The FormatStreams module provides the streamf function to wrap files and IO streams that contain sequences of formatted objects. streamf returns a FormattedStream object that can be used with the standard IO functions, such as read, write, seek or close. This makes it easy to read/write sequences of objects according to MIME types. FormatStreams extends Formats and will automatically detect file formats/codings. Convenience functions are also provided to iterate over the contents of formatted streams.
You can use FormatStreams under the terms of the MIT “Expat” License; see LICENSE.md.
FormatStreams is not a registered package. You can add it to your Julia environment by giving the URL to its repository:
using Pkg
Pkg.add(PackageSpec(url="https://github.com/ofisette/FormatStreams.jl"))
This documentation gives an overview of the types and functions that form FormatStreams’s public interface. For details, refer to the documentation of individual functions and types, available in the REPL. The basic usage section of the documentation is also accessible from the REPL:
?FormatStreams
FormatStreams provides a basic framework to manage streams of formatted objects. However, FormatStreams itself does not define any specific format or function to read or write objects in specific formats. All examples below rely on Dorothy for reading/writing TRR and XTC molecular trajectories.
using Formats # Framework for data formats/codings (separate package)
using FormatCodecs # Codecs for common codings (separate package)
using FormatStreams # Streams of formatted data (this package)
To open a file as a stream of formatted objects, automatically inferring its format/coding if unspecified:
s = streamf("trajectory.xtc")
An existing IO stream can be wrapped as a stream of formatted objects:
io = open("trajectory.xtc")
s = streamf(io)
Format/coding can be specified using functions from the Formats package:
s = streamf(specify("lysozyme.dat", "trajectory/x-trr"))
The FormattedStream returned by streamf can be used with standard IO functions. Basic functions are always supported:
s = streamf("trajectory.xtc")
position(s)
eof(s)
seekstart(s)
close(s)
The read function is typically supported, allowing sequential reading:
s = streamf("trajectory.xtc")
while !eof(s)
    frame = read(s)
    [...]
end
close(s)
The read! function may be supported to read into a pre-allocated output buffer:
s = streamf("trajectory.xtc")
frame = MolecularModel()
while !eof(s)
    read!(s, frame)
    [...]
end
close(s)
Some formatted streams are seekable:
s = streamf("trajectory.trr")
seek(s, 500)
seekend(s)
Some have a known length:
n = length(s)
Some allow replacing and/or appending values with write:
s1 = streamf("input.xtc")
s2 = streamf(openf("output.trr", "w"))
frame = MolecularModel()
while !eof(s1)
    read!(s1, frame)
    write(s2, frame)
end
close(s1)
close(s2)
And some can be truncated:
truncate(s, 5)
The streamf function supports do block syntax, automatically closing the formatted stream upon completion:
streamf("trajectory.xtc") do s
    frame = MolecularModel()
    while !eof(s)
        read!(s, frame)
        [...]
    end
end
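The do block form resembles the pattern used by Base's open(f, args...), where the resource is closed once the body returns. A minimal sketch of that pattern using only Base is shown here; withstream is a hypothetical helper written for illustration and is not part of FormatStreams, and the sketch assumes (the text above does not say) that closing should also happen when the body throws:

```julia
# Hypothetical helper illustrating the do-block pattern; not FormatStreams code.
# `f` receives the stream, and the stream is closed even if `f` throws.
function withstream(f::Function, io::IO)
    try
        return f(io)
    finally
        close(io)
    end
end

# The do block becomes the first argument `f` of withstream.
io = IOBuffer("frame data")
result = withstream(io) do s
    read(s, String)
end
```

Because the do block is passed as the first positional argument, any function whose first parameter is a function can be called this way.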
The convenience function eachval iterates over all values in a formatted stream:
s = streamf("trajectory.xtc")
for frame in eachval(s)
    [...]
end
The eachval! function iterates over a stream by reading into a pre-allocated output buffer. Combined with a do block, it provides a compact and efficient way to iterate over formatted objects:
streamf("trajectory.xtc") do s
    for frame in eachval!(s, MolecularModel())
        [...]
    end
end
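To illustrate why reading into a pre-allocated buffer is efficient, here is a rough sketch of a buffer-reusing iterator built only on Base. EachInto is a hypothetical type invented for this example; it is not how FormatStreams implements eachval!, and it works on raw bytes rather than formatted objects:

```julia
# Hypothetical iterator illustrating the buffer-reuse idea; not FormatStreams code.
struct EachInto{S<:IO,B<:AbstractArray}
    stream::S
    buffer::B
end

# Each iteration overwrites `buffer` in place and yields the same object,
# so no new array is allocated per record.
function Base.iterate(it::EachInto, state=nothing)
    eof(it.stream) && return nothing
    read!(it.stream, it.buffer)
    return it.buffer, nothing
end

# The number of records is not known in advance.
Base.IteratorSize(::Type{<:EachInto}) = Base.SizeUnknown()

io = IOBuffer(UInt8[1, 2, 3, 4, 5, 6])    # three 2-byte records
sums = Int[]
for rec in EachInto(io, zeros(UInt8, 2))
    push!(sums, sum(Int, rec))             # use the value before the next iteration
end
```

In this sketch every iteration yields the same buffer object, so a value that must outlive the current iteration has to be copied first.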
FormatStreams provides streamf as the single way to open a stream of formatted objects. streamf operates on filenames, IO streams and Formatted objects (usually created with the infer and specify functions from the Formats package).
Objects returned by streamf specialize the FormattedStream abstract type. These always support basic IO functions position, eof, seekstart and close. In addition, they usually support read, and may also support read!, seek, seekend, length, write or truncate.
Convenience functions eachval and eachval! provide an easy way to wrap formatted streams and treat them as iterators.
To integrate your own packages with FormatStreams, you first need to define the formats you wish to support, register them, and define a singleton type to identify your IO streamer. See the documentation of Formats for details.
You should then register your streamer using addstreamer (not exported by default). Finally, you must specialize streamf; the specific signature is documented. The object you return from streamf should specialize FormattedStream and support the IO functions listed above.
Calls to addstreamer should take place in your package’s __init__ function since they modify a global variable. See the documentation of Formats for a more detailed explanation.
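The reason for using __init__ is that top-level code in a package runs during precompilation, and changes it makes to another module's global state are not replayed when the package is later loaded. The sketch below shows the general pattern with stand-in modules. RegistrySketch, register! and MyStreamerSketch are hypothetical names used for illustration only; the real call would be addstreamer, whose signature is not reproduced here:

```julia
# Hypothetical stand-in for a package that owns a global registry.
module RegistrySketch
    const STREAMERS = Dict{String,Symbol}()
    register!(format::String, streamer::Symbol) = (STREAMERS[format] = streamer)
end

# Hypothetical client package: registration happens in __init__, which Julia
# runs each time the module is loaded, rather than at precompile time.
module MyStreamerSketch
    import ..RegistrySketch
    function __init__()
        RegistrySketch.register!("trajectory/x-trr", :MyStreamer)
    end
end
```

After both modules are loaded, the registry contains the entry added by MyStreamerSketch, without any registration code running at the top level of that module.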
When multiple streamers are available for a given format, a specific streamer can be selected, either for a single format or on a global basis. This is done via preferstreamer (not exported by default).
- Formats: Read, write and detect formatted data, based on MIME types (dependency).