Implement HSD-wrappers to manipulate nested content #25

Draft
wants to merge 3 commits into
base: main
2 changes: 2 additions & 0 deletions .gitignore
@@ -1,5 +1,7 @@
*~
.idea
.env
.vscode
*.pyc
dist
build
2 changes: 1 addition & 1 deletion docs/conf.py
@@ -56,7 +56,7 @@
# a list of builtin themes.
#
# html_theme = 'alabaster'
html_theme = 'sphinx_rtd_theme'
html_theme = 'sphinx_book_theme'

# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
6 changes: 3 additions & 3 deletions docs/hsd.rst
@@ -205,7 +205,7 @@ As an example, let's store the input from the previous section ::
in the file `test.hsd`, parse it and convert the node names to lower case
(to enable case-insensitive input processing). Using the Python command ::

inpdict = hsd.load("test.hsd", lower_tag_names=True, include_hsd_attribs=True)
inpdict = hsd.load("test.hsd", lower_names=True, save_hsd_attribs=True)

will yield the following dictionary representation of the input::

@@ -236,7 +236,7 @@ The node names and formatting information about the equal sign ensures
that the formatting is similar to the original HSD, if the data is dumped
into the HSD format again. Dumping the dictionary with ::

hsd.dump(inpdict, "test2-formatted.hsd", use_hsd_attribs=True)
hsd.dump(inpdict, "test2-formatted.hsd", apply_hsd_attribs=True)

would indeed yield ::

@@ -251,7 +251,7 @@ which is basically identical with the original input. If the additional
processing information is not recorded when the data is loaded, or
it is not considered when the data is dumped as HSD again ::

inpdict = hsd.load("test.hsd", lower_tag_names=True)
inpdict = hsd.load("test.hsd", lower_names=True)
hsd.dump(inpdict, "test2-unformatted.hsd")

the resulting formatting will differ more strongly from the original HSD ::
87 changes: 87 additions & 0 deletions docs/introduction.rst
@@ -36,6 +36,9 @@ or into the user space issueing ::
Quick tutorial
==============

The basics
----------

A typical, self-explaining input written in HSD looks like ::

driver {
@@ -117,3 +120,87 @@ Python ::
and then stored again in HSD format ::

hsd.dump(hsdinput, "test2.hsd")



Accessing nested data structures via wrappers
---------------------------------------------

The hsd module contains lightweight wrappers (``HsdDict``, ``HsdList`` and
``HsdValue``), which offer convenient access to entries in nested data
structures. With the help of these wrappers, nested nodes and values can be
directly accessed using paths. When accessing nested content via wrappers, the
resulting objects will be wrappers themselves, wrapping the appropriate parts of
the data structure (and inheriting certain properties of the original wrapper).

For example, reading and wrapping the example above::

import hsd
hsdinp = hsd.wrap(hsd.load("test.hsd"))

creates an ``HsdDict`` wrapper instance (``hsdinp``), which can be used to query
encapsulated information in the structure::

# Reading out the value directly (100)
maxsteps = hsdinp["driver", "conjugate_gradients", "max_steps"].value

# Storing wrapper (HsdValue) instance and reading out value and the attribute
temp = hsdinp["hamiltonian / dftb / filling / fermi / temperature"]
temp_value = temp.value
temp_unit = temp.attrib

# Getting a default value if the given path does not exist
pot = hsdinp.get_item("hamiltonian / dftb / bias", default=hsd.HsdValue(100, attrib="V"))

# Setting a value for given path by creating missing parents
hsdinp.set_item("analysis / calculate_forces", True, parents=True)

# Getting the value at a path, or a default value if the path does not exist.
# In the latter case, the path is created (incl. missing parents) and set to the default value.
has_mulliken = hsdinp.set_default(
"analysis / mullikenanalysis", default=hsd.HsdValue(True), parents=True
).value

As demonstrated above, paths can be specified as tuples or as slash (``/``) joined strings.
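
To make the two path notations concrete, here is a minimal sketch of path-based access over plain nested dictionaries. It does not use the hsd package and is not the implementation in this PR: the helper names (``normalize_path``, ``get_item``, ``set_item``) are hypothetical and only mirror the behaviour described above, assuming that a string path is split on ``/`` and that surrounding whitespace is ignored.

```python
from typing import Any, Tuple, Union

PathLike = Union[str, Tuple[str, ...]]


def normalize_path(path: PathLike) -> Tuple[str, ...]:
    """Turn "a / b / c" or ("a", "b", "c") into a tuple of node names."""
    if isinstance(path, str):
        return tuple(part.strip() for part in path.split("/"))
    return tuple(path)


def get_item(data: dict, path: PathLike, default: Any = None) -> Any:
    """Follow the path through nested dicts; return default if a step is missing."""
    node = data
    for name in normalize_path(path):
        if not isinstance(node, dict) or name not in node:
            return default
        node = node[name]
    return node


def set_item(data: dict, path: PathLike, value: Any, parents: bool = False) -> None:
    """Store value at path; missing parent dicts are created only if parents is True."""
    *parent_names, last = normalize_path(path)
    node = data
    for name in parent_names:
        if name not in node:
            if not parents:
                raise KeyError(f"missing parent node '{name}'")
            node[name] = {}
        node = node[name]
    node[last] = value


tree = {"driver": {"conjugate_gradients": {"max_steps": 100}}}

# Both notations address the same node.
by_tuple = get_item(tree, ("driver", "conjugate_gradients", "max_steps"))
by_string = get_item(tree, "driver / conjugate_gradients / max_steps")

# A missing path falls back to the default.
bias = get_item(tree, "hamiltonian / dftb / bias", default=5)

# Missing parents are created on request.
set_item(tree, "analysis / calculate_forces", True, parents=True)
```

Normalising every path to a tuple up front keeps the lookup and the assignment code independent of the notation the caller chose.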

The wrappers also support case-insensitive access. Let's have a look at a
mixed-case example file ``test2.hsd``::

Driver {
ConjugateGradients {
MovedAtoms = 1 2 "7:19"
MaxSteps = 100
}
}

We now make a copy of the data structure before wrapping it, and make sure that
all keys are converted to lower case, but the original names are saved as
HSD-attributes::

hsdinp = hsd.copy(hsd.load("test2.hsd"), lower_names=True, save_names=True)

This way, paths passed to the Hsd-wrapper are treated in a case-insensitive
way::

maxsteps = hsdinp["driver", "CONJUGATEGRADIENTS", "MAXSTEPS"].value

When new items are added, access remains case-insensitive, but the name of the
new node is saved in the form in which it was given. The code snippet::

hsdinp["driver", "conjugategradients", "MaxForce"] = hsd.HsdValue(1e-4, attrib="au")
maxforceval = hsdinp["driver", "conjugategradients", "maxforce"]
print(f"{maxforceval.value} {maxforceval.attrib}")
print(hsd.dump_string(hsdinp.value, apply_hsd_attribs=True))

will result in ::

0.0001 au
Driver {
ConjugateGradients {
MovedAtoms = 1 2 "7:19"
MaxSteps = 100
MaxForce [au] = 0.0001
}
}

where the case-convention for ``MaxForce`` is identical to the one used when the
item was created.
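
The mechanism behind this behaviour can be sketched with plain dictionaries: keys are stored in lower case, and the original spelling is kept next to each key under a ``.hsdattrib`` entry. The sketch below is only an illustration under that assumption. The two constants reuse the literals defined in this PR, but the helper functions (``lower_copy``, ``lookup``, ``set_child``, ``original_names``) are hypothetical and are not part of the hsd API.

```python
# Literals as defined in src/hsd/dict.py and src/hsd/common.py of this PR.
HSD_ATTRIB_KEY_SUFFIX = ".hsdattrib"
HSD_ATTRIB_NAME = "name"


def lower_copy(data: dict) -> dict:
    """Return a deep copy with lower-case keys that remembers the original names."""
    result = {}
    for key, value in data.items():
        lowkey = key.lower()
        result[lowkey] = lower_copy(value) if isinstance(value, dict) else value
        result[lowkey + HSD_ATTRIB_KEY_SUFFIX] = {HSD_ATTRIB_NAME: key}
    return result


def lookup(data: dict, *names: str):
    """Case-insensitive lookup: every path component is lowered before use."""
    node = data
    for name in names:
        node = node[name.lower()]
    return node


def set_child(data: dict, parent_names, name: str, value) -> None:
    """Add a node under a lowered key and record the spelling used by the caller."""
    parent = lookup(data, *parent_names)
    parent[name.lower()] = value
    parent[name.lower() + HSD_ATTRIB_KEY_SUFFIX] = {HSD_ATTRIB_NAME: name}


def original_names(data: dict) -> dict:
    """Rebuild the tree with the saved spellings (what a dump would print)."""
    result = {}
    for key, value in data.items():
        if key.endswith(HSD_ATTRIB_KEY_SUFFIX):
            continue
        saved = data.get(key + HSD_ATTRIB_KEY_SUFFIX, {})
        name = saved.get(HSD_ATTRIB_NAME, key)
        result[name] = original_names(value) if isinstance(value, dict) else value
    return result


raw = {"Driver": {"ConjugateGradients": {"MaxSteps": 100}}}
tree = lower_copy(raw)

maxsteps = lookup(tree, "driver", "CONJUGATEGRADIENTS", "MAXSTEPS")
set_child(tree, ("driver", "conjugategradients"), "MaxForce", 1e-4)
restored = original_names(tree)
```

Here ``restored`` carries the key ``MaxForce`` exactly as it was spelled at creation time, while all lookups go through the lowered keys.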
6 changes: 3 additions & 3 deletions src/hsd/__init__.py
@@ -7,12 +7,12 @@
"""
Toolbox for reading, writing and manipulating HSD-data.
"""
from hsd.common import HSD_ATTRIB_LINE, HSD_ATTRIB_EQUAL, HSD_ATTRIB_SUFFIX,\
HSD_ATTRIB_NAME, HsdError
from hsd.dict import HsdDictBuilder, HsdDictWalker
from hsd.common import HSD_ATTRIB_LINE, HSD_ATTRIB_EQUAL, HSD_ATTRIB_NAME, HsdError
from hsd.dict import ATTRIB_KEY_SUFFIX, HSD_ATTRIB_KEY_SUFFIX, HsdDictBuilder, HsdDictWalker
from hsd.eventhandler import HsdEventHandler, HsdEventPrinter
from hsd.formatter import HsdFormatter
from hsd.io import load, load_string, dump, dump_string
from hsd.parser import HsdParser
from hsd.wrappers import HsdDict, HsdList, HsdValue, copy, wrap

__version__ = '0.1'
6 changes: 0 additions & 6 deletions src/hsd/common.py
@@ -28,12 +28,6 @@ def unquote(txt):
# Name for default attribute (when attribute name is not specified)
DEFAULT_ATTRIBUTE = "unit"

# Suffix to mark attribute
ATTRIB_SUFFIX = ".attrib"

# Suffix to mark hsd processing attributes
HSD_ATTRIB_SUFFIX = ".hsdattrib"

# HSD attribute containing the original tag name
HSD_ATTRIB_NAME = "name"

49 changes: 28 additions & 21 deletions src/hsd/dict.py
@@ -9,10 +9,17 @@
"""
import re
from typing import List, Tuple, Union
from hsd.common import HSD_ATTRIB_NAME, np, ATTRIB_SUFFIX, HSD_ATTRIB_SUFFIX, HsdError,\
QUOTING_CHARS, SPECIAL_CHARS
from hsd.common import HSD_ATTRIB_NAME, np, HsdError, QUOTING_CHARS, SPECIAL_CHARS
from hsd.eventhandler import HsdEventHandler, HsdEventPrinter


# Dictionary key suffix to mark attribute
ATTRIB_KEY_SUFFIX = ".attrib"

# Dictionary key suffix to mark hsd processing attributes
HSD_ATTRIB_KEY_SUFFIX = ".hsdattrib"


_ItemType = Union[float, complex, int, bool, str]

_DataType = Union[_ItemType, List[_ItemType]]
@@ -69,24 +76,24 @@ class HsdDictBuilder(HsdEventHandler):
flatten_data: Whether multiline data in the HSD input should be
flattened into a single list. Otherwise a list of lists is created, with one list for
every line (default).
lower_tag_names: Whether tag names should be all converted to lower case (to ease case
insensitive processing). Default: False. If set and include_hsd_attribs is also set,
lower_names: Whether tag names should be all converted to lower case (to ease case
insensitive processing). Default: False. If set and save_hsd_attribs is also set,
the original tag names can be retrieved from the "name" hsd attributes.
include_hsd_attribs: Whether the HSD-attributes (processing related attributes, like
save_hsd_attribs: Whether the HSD-attributes (processing related attributes, like
original tag name, line information, etc.) should be stored (default: False).
"""

def __init__(self, flatten_data: bool = False, lower_tag_names: bool = False,
include_hsd_attribs: bool = False):
def __init__(self, flatten_data: bool = False, lower_names: bool = False,
save_hsd_attribs: bool = False):
super().__init__()
self._hsddict: dict = {}
self._curblock: dict = self._hsddict
self._parentblocks: List[dict] = []
self._data: Union[None, _DataType] = None
self._attribs: List[Tuple[str, dict]] = []
self._flatten_data: bool = flatten_data
self._lower_tag_names: bool = lower_tag_names
self._include_hsd_attribs: bool = include_hsd_attribs
self._lower_names: bool = lower_names
self._save_hsd_attribs: bool = save_hsd_attribs


@property
@@ -107,7 +114,7 @@ def open_tag(self, tagname, attrib, hsdattrib):
def close_tag(self, tagname):
attrib, hsdattrib = self._attribs.pop(-1)
parentblock = self._parentblocks.pop(-1)
key = tagname.lower() if self._lower_tag_names else tagname
key = tagname.lower() if self._lower_names else tagname
prevcont = parentblock.get(key)

if self._data is not None:
@@ -130,26 +137,26 @@ def close_tag(self, tagname):
parentblock[key] = [{None: prevcont}, self._curblock]

if attrib and prevcont is None:
parentblock[key + ATTRIB_SUFFIX] = attrib
parentblock[key + ATTRIB_KEY_SUFFIX] = attrib
elif prevcont is not None:
prevattrib = parentblock.get(key + ATTRIB_SUFFIX)
prevattrib = parentblock.get(key + ATTRIB_KEY_SUFFIX)
if isinstance(prevattrib, list):
prevattrib.append(attrib)
else:
parentblock[key + ATTRIB_SUFFIX] = [prevattrib, attrib]
parentblock[key + ATTRIB_KEY_SUFFIX] = [prevattrib, attrib]

if self._include_hsd_attribs:
if self._lower_tag_names:
if self._save_hsd_attribs:
if self._lower_names:
hsdattrib = {} if hsdattrib is None else hsdattrib
hsdattrib[HSD_ATTRIB_NAME] = tagname
if prevcont is None:
parentblock[key + HSD_ATTRIB_SUFFIX] = hsdattrib
parentblock[key + HSD_ATTRIB_KEY_SUFFIX] = hsdattrib
else:
prevhsdattrib = parentblock.get(key + HSD_ATTRIB_SUFFIX)
prevhsdattrib = parentblock.get(key + HSD_ATTRIB_KEY_SUFFIX)
if isinstance(prevhsdattrib, list):
prevhsdattrib.append(hsdattrib)
else:
parentblock[key + HSD_ATTRIB_SUFFIX] = [prevhsdattrib, hsdattrib]
parentblock[key + HSD_ATTRIB_KEY_SUFFIX] = [prevhsdattrib, hsdattrib]
self._curblock = parentblock
self._data = None

@@ -219,11 +226,11 @@ def walk(self, dictobj):

for key, value in dictobj.items():

if key.endswith(ATTRIB_SUFFIX) or key.endswith(HSD_ATTRIB_SUFFIX):
if key.endswith(ATTRIB_KEY_SUFFIX) or key.endswith(HSD_ATTRIB_KEY_SUFFIX):
continue

hsdattrib = dictobj.get(key + HSD_ATTRIB_SUFFIX)
attrib = dictobj.get(key + ATTRIB_SUFFIX)
hsdattrib = dictobj.get(key + HSD_ATTRIB_KEY_SUFFIX)
attrib = dictobj.get(key + ATTRIB_KEY_SUFFIX)

if isinstance(value, dict):

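
The key-suffix convention used by the walker above can be illustrated with a small standalone generator. This is only a sketch and not the ``HsdDictWalker`` code: ``iter_nodes`` is a hypothetical helper, and it assumes, as in the hunk above, that entries ending in ``.attrib`` or ``.hsdattrib`` are companions of the node with the same base key rather than nodes of their own.

```python
# Same literals as the constants introduced in src/hsd/dict.py by this PR.
ATTRIB_KEY_SUFFIX = ".attrib"
HSD_ATTRIB_KEY_SUFFIX = ".hsdattrib"


def iter_nodes(dictobj: dict, path: tuple = ()):
    """Yield (path, attrib, hsdattrib) for every real node in a nested dict."""
    for key, value in dictobj.items():
        # Companion entries describe another key and are not nodes themselves.
        if key.endswith(ATTRIB_KEY_SUFFIX) or key.endswith(HSD_ATTRIB_KEY_SUFFIX):
            continue
        attrib = dictobj.get(key + ATTRIB_KEY_SUFFIX)
        hsdattrib = dictobj.get(key + HSD_ATTRIB_KEY_SUFFIX)
        yield path + (key,), attrib, hsdattrib
        if isinstance(value, dict):
            yield from iter_nodes(value, path + (key,))


data = {
    "temperature": 0.0,
    "temperature.attrib": "Kelvin",
    "filling": {"order": 2},
}
nodes = list(iter_nodes(data))
```

Each node is reported once together with its attributes, and the suffixed entries never show up as nodes.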
10 changes: 5 additions & 5 deletions src/hsd/formatter.py
@@ -20,14 +20,14 @@ class HsdFormatter(HsdEventHandler):

Args:
fobj: File like object to write the formatted output to.
use_hsd_attribs: Whether HSD attributes passed to the formatter should
apply_hsd_attribs: Whether HSD attributes passed to the formatter should
be considered when formatting the output (default: True)
"""

def __init__(self, fobj, use_hsd_attribs=True):
def __init__(self, fobj, apply_hsd_attribs=True):
super().__init__()
self._fobj: TextIO = fobj
self._use_hsd_attribs: bool = use_hsd_attribs
self._apply_hsd_attribs: bool = apply_hsd_attribs
self._level: int = 0
self._indent_level: int = 0
# Whether last node on current level should/was followed by an
@@ -61,7 +61,7 @@ def open_tag(self, tagname: str, attrib: str, hsdattrib: dict):
else:
indentstr = self._indent_level * _INDENT_STR

if self._use_hsd_attribs and hsdattrib is not None:
if self._apply_hsd_attribs and hsdattrib is not None:
tagname = hsdattrib.get(HSD_ATTRIB_NAME, tagname)

self._fobj.write(f"{indentstr}{tagname}{attribstr}")
@@ -74,7 +74,7 @@ def open_tag(self, tagname: str, attrib: str, hsdattrib: dict):
self._level += 1

equal = None
if hsdattrib is not None and self._use_hsd_attribs:
if hsdattrib is not None and self._apply_hsd_attribs:
equal = hsdattrib.get(HSD_ATTRIB_EQUAL)
self._followed_by_equal.append(equal)
