activereader.base module

Provides the tools to build classes corresponding to XML elements in GPX and TCX files.

class activereader.base.ActivityElement(lxml_elem)

Base class for creating compositions with _Element.

TCX and GPX files are both XML documents that each follow their own schema. ActivityElement subclasses provide access to the underlying XML elements’ data, attributes, and descendent elements.

Parameters

lxml_elem (_Element) – XML element object that has been read in by the lxml.etree package. The tag name of the element must match the TAG attribute of this class, or else a TypeError will be raised.

TAG = 'element'

XML tag name of the element.

If an instance of the class is initialized from an lxml.etree._Element with any other tag name, a TypeError will be raised.
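
The tag check described above can be sketched as follows. This is only an illustration under stated assumptions: it uses the standard library's xml.etree.ElementTree in place of lxml, and the names SketchElement, SketchTrackpoint and elem are hypothetical, not part of activereader.

```python
import xml.etree.ElementTree as ET


class SketchElement:
    """Hypothetical stand-in for ActivityElement (not the library's code)."""

    TAG = "element"

    def __init__(self, elem):
        # Reject elements whose tag does not match the class-level TAG.
        if elem.tag != self.TAG:
            raise TypeError(f"Expected tag {self.TAG!r}, got {elem.tag!r}")
        self.elem = elem


class SketchTrackpoint(SketchElement):
    # Subclasses only need to override TAG.
    TAG = "Trackpoint"


ok = SketchTrackpoint(ET.fromstring("<Trackpoint/>"))

try:
    SketchTrackpoint(ET.fromstring("<Lap/>"))
    mismatch_raised = False
except TypeError:
    mismatch_raised = True
```

Keeping the check in the base class means each subclass declares its tag once and inherits the validation.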

classmethod _add_attr_properties(**property_keys_types)

Add properties to cls that access specific keys using get_attr().

Parameters
  • cls – Class to add the properties to.

  • **property_keys_types

    kwargs mapping property_name : tuple(key, data_type)

    • property_name will be added to cls.

    • key is the attribute name within the contained lxml element whose value will be retrieved.

    • data_type is the type that the attribute’s text will be converted to. A Python type, or datetime.datetime to read a timestamp using dateutil.parser.isoparse().

Examples

>>> class MyElement(ActivityElement):
...   [...]
>>> MyElement._add_attr_properties(start_time=('StartTime', datetime.datetime))

classmethod _add_data_properties(**property_paths_types)

Add properties to cls that access specific paths using get_data().

Parameters
  • cls – Class to add the properties to.

  • **property_paths_types

    kwargs mapping property_name : tuple(path, data_type)

    • property_name will be added to cls.

    • path is the tag name or path of the desired subelement of the contained lxml element whose text will be retrieved.

    • data_type is the type that the subelement’s text will be converted to. A Python type, or datetime.datetime to read a timestamp using dateutil.parser.isoparse().

Examples

>>> class MyElement(ActivityElement):
...   [...]
>>> MyElement._add_data_properties(time=('Time', datetime.datetime))

classmethod _add_descendent_properties(**descendent_classes)

Add properties to cls that return all descendents matching a path.

Parameters
  • cls – Class to add the properties to.

  • **descendent_classes

    kwargs mapping property_name : descendent class.

    • property_name will be added to cls.

    • All values must be a subclass of ActivityElement.

Examples

>>> class MyElement(ActivityElement):
...   [...]
>>> MyElement._add_descendent_properties(subelements=MySubElement)
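
How a classmethod might attach such list-returning properties can be sketched as below. This is an assumption-based illustration, not activereader's implementation: it uses xml.etree.ElementTree instead of lxml, and SketchElement, SketchTrack, SketchPoint and add_descendant_properties are hypothetical names.

```python
import xml.etree.ElementTree as ET


class SketchElement:
    """Hypothetical stand-in for ActivityElement (not the library's code)."""

    TAG = "element"

    def __init__(self, elem):
        self.elem = elem

    @classmethod
    def add_descendant_properties(cls, **descendant_classes):
        # One property per keyword: property_name -> descendant class.
        for name, klass in descendant_classes.items():
            setattr(cls, name, cls._make_descendant_prop(klass))

    @staticmethod
    def _make_descendant_prop(klass):
        # A factory function gives each property its own `klass` binding.
        def getter(self):
            # iter() walks descendants at any depth, in document order.
            return [klass(e) for e in self.elem.iter(klass.TAG)]

        return property(getter)


class SketchPoint(SketchElement):
    TAG = "Point"


class SketchTrack(SketchElement):
    TAG = "Track"


SketchTrack.add_descendant_properties(points=SketchPoint)

track = SketchTrack(
    ET.fromstring("<Track><Seg><Point/><Point/></Seg><Point/></Track>")
)
points = track.points
```

Here track.points wraps every Point element below the Track, including those nested inside Seg.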

get_attr(key, conv_type=<class 'str'>)

Retrieve data using contained lxml element’s get() and convert its type.

Parameters
  • key (str) – The attribute name within the contained lxml element.

  • conv_type (data type) – Data type that the attribute text will be converted to. Python type, or datetime.datetime to read a time using dateutil.parser.isoparse(). Defaults to str.

Returns

The retrieved attribute data in the requested type, or None if the contained lxml element does not have an attribute named key.

Examples

>>> e = ActivityElement(etree.fromstring(
...   '<element StartTime="2021-02-26T19:51:07.000Z"></element>'
... ))
>>> e.get_attr('StartTime', conv_type=datetime.datetime)
datetime.datetime(2021, 2, 26, 19, 51, 7, tzinfo=tzutc())
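
The conversion behaviour can also be approximated without activereader installed. The sketch below rests on several assumptions: it uses xml.etree.ElementTree instead of lxml, datetime.datetime.fromisoformat() instead of dateutil.parser.isoparse(), and a hypothetical helper named sketch_get_attr.

```python
import datetime
import xml.etree.ElementTree as ET


def sketch_get_attr(elem, key, conv_type=str):
    """Hypothetical helper mirroring the documented get_attr() behaviour."""
    raw = elem.get(key)
    if raw is None:
        # The element has no attribute named `key`.
        return None
    if conv_type is datetime.datetime:
        # Stand-in for dateutil.parser.isoparse(): fromisoformat() only
        # accepts a trailing "Z" from Python 3.11, so normalise it first.
        return datetime.datetime.fromisoformat(raw.replace("Z", "+00:00"))
    return conv_type(raw)


elem = ET.fromstring('<element StartTime="2021-02-26T19:51:07.000Z" Laps="3"/>')
start = sketch_get_attr(elem, "StartTime", datetime.datetime)
laps = sketch_get_attr(elem, "Laps", int)
missing = sketch_get_attr(elem, "Sport")
```

A missing attribute yields None rather than raising, which matches the return value documented above.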

get_data(path, conv_type=<class 'str'>)

Retrieve data using the contained lxml element’s findtext() and convert its type.

Parameters
  • path (str) – The tag name or path of the desired subelement of the contained lxml element whose text will be retrieved. Text will be retrieved from the first matching subelement.

  • conv_type (data type) – Data type that the element text will be converted to. Python type, or datetime.datetime to read a time using dateutil.parser.isoparse(). Defaults to str.

Returns

The retrieved data in the requested type, or None if no subelements match path.

Examples

Text can be converted to another Python type:

>>> e_lat = ActivityElement(etree.fromstring(
...   '<element><Position><LatitudeDegrees>40.0</LatitudeDegrees></Position></element>'
... ))
>>> e_lat.get_data('Position/LatitudeDegrees', conv_type=float)
40.0

Or a timestamp can be read in:

>>> e_time = ActivityElement(etree.fromstring(
...   '<element><Time>2021-02-26T19:51:08.000Z</Time></element>'
... ))
>>> e_time.get_data('Time', conv_type=datetime.datetime)
datetime.datetime(2021, 2, 26, 19, 51, 8, tzinfo=tzutc())
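
The first-match and no-match cases can be illustrated with a small stand-in. This sketch assumes xml.etree.ElementTree in place of lxml (both provide findtext()), and sketch_get_data is a hypothetical helper, not the library's method.

```python
import xml.etree.ElementTree as ET


def sketch_get_data(elem, path, conv_type=str):
    """Hypothetical helper mirroring the documented get_data() behaviour."""
    text = elem.findtext(path)
    # findtext() returns None when no subelement matches the path.
    return None if text is None else conv_type(text)


elem = ET.fromstring(
    "<element>"
    "<Position><LatitudeDegrees>40.0</LatitudeDegrees></Position>"
    "<Position><LatitudeDegrees>41.5</LatitudeDegrees></Position>"
    "</element>"
)
# Two subelements match; only the first one's text is used.
first_lat = sketch_get_data(elem, "Position/LatitudeDegrees", float)
# No subelement matches this path.
no_match = sketch_get_data(elem, "Position/AltitudeMeters", float)
```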

class activereader.base.XmlReader(filepath_or_buffer, ext='XML')

XmlReader provides an interface for reading in an XML file (e.g. GPX, TCX).

_get_data_from_filepath(filepath_or_buffer)

The method {reader}.from_file accepts four input types:
  1. filepath (string-like)

  2. bytes

  3. file-like object (e.g. open file object, StringIO, BytesIO)

  4. XML string (e.g. the contents of a GPX or TCX document)

This method turns (1) and (2) into (3) to simplify the rest of the processing. It returns input types (3) and (4) unchanged.

Raises FileNotFoundError if the input is a string ending in .{ext} but no such file exists.

Ref:

https://github.com/pandas-dev/pandas/blob/v1.5.1/pandas/io/json/_json.py#L837
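
The normalisation described above might look roughly like the sketch below. It is an assumption-based illustration rather than the method's actual code: sketch_normalize_input is a hypothetical name, the extension comparison is made case-insensitive as a guess, and closing an opened file is left to the caller.

```python
import io
import os


def sketch_normalize_input(filepath_or_buffer, ext="xml"):
    """Hypothetical normaliser for the four documented input types."""
    data = filepath_or_buffer
    if isinstance(data, bytes):
        # (2) bytes become (3) a file-like object.
        return io.BytesIO(data)
    if isinstance(data, (str, os.PathLike)):
        path = os.fspath(data)
        if os.path.exists(path):
            # (1) an existing filepath becomes (3) an open file object.
            return open(path, "rb")
        if isinstance(path, str) and path.lower().endswith("." + ext.lower()):
            # The string looks like a filename, but no such file exists.
            raise FileNotFoundError(f"File {path} does not exist")
    # (3) file-like objects and (4) XML strings are returned unchanged.
    return data


from_bytes = sketch_normalize_input(b"<gpx/>")
xml_text = "<gpx></gpx>"
from_text = sketch_normalize_input(xml_text)
```

Checking for an existing path before testing the extension lets a string of XML fall through to the final return untouched.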

_preprocess_data(data)

At this point, the data either has a read attribute (e.g. an open file object, a StringIO, or a BytesIO) or is a string containing an XML document. Any of these are acceptable inputs to lxml.etree.parse(), so this method currently returns the data unchanged.

Ref:

https://github.com/pandas-dev/pandas/blob/v1.5.1/pandas/io/json/_json.py#L821

read()

Read the whole input into an lxml.etree._Element.

activereader.base.add_xml_attr(**property_keys_types)

Add properties to a class that access data using get_attr().

This provides an alternative usage to directly calling ActivityElement._add_attr_properties() below a class definition.

Parameters

**property_keys_types

kwargs mapping property_name : tuple(key, data_type)

  • property_name will be added to the decorated class.

  • key is the attribute name within the contained lxml element whose value will be retrieved.

  • data_type is the type that the attribute’s text will be converted to. A Python type, or datetime.datetime to read a timestamp using dateutil.parser.isoparse().

Returns

A class decorator.

Return type

callable

Examples

>>> @add_xml_attr(start_time=('StartTime', datetime.datetime))
... class MyElement(ActivityElement):
...   [...]
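
A decorator factory of this shape can be sketched as follows. This is not activereader's implementation: sketch_add_attr and SketchActivity are hypothetical names, xml.etree.ElementTree stands in for lxml, and timestamp handling is omitted for brevity.

```python
import xml.etree.ElementTree as ET


def sketch_add_attr(**property_keys_types):
    """Hypothetical decorator factory (not activereader's implementation)."""

    def make_prop(key, conv_type):
        # A helper function keeps `key` and `conv_type` bound per property.
        def getter(self):
            raw = self.elem.get(key)
            return None if raw is None else conv_type(raw)

        return property(getter)

    def decorator(cls):
        for name, (key, conv_type) in property_keys_types.items():
            setattr(cls, name, make_prop(key, conv_type))
        # Return the same class, now carrying the new properties.
        return cls

    return decorator


@sketch_add_attr(lap_count=("Laps", int), sport=("Sport", str))
class SketchActivity:
    def __init__(self, elem):
        self.elem = elem


activity = SketchActivity(ET.fromstring('<Activity Laps="3"/>'))
```

With this input, activity.lap_count is the integer 3 and activity.sport is None because the attribute is absent.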

activereader.base.add_xml_data(**property_paths_types)

Add properties to a class that access data using get_data().

This provides an alternative usage to directly calling ActivityElement._add_data_properties() below a class definition.

Parameters

**property_paths_types

kwargs mapping property_name : tuple(path, data_type)

  • property_name will be added to the decorated class.

  • path is the tag name or path of the desired subelement of the contained lxml element whose text will be retrieved.

  • data_type is the type that the subelement’s text will be converted to. A Python type, or datetime.datetime to read a timestamp using dateutil.parser.isoparse().

Returns

A class decorator.

Return type

callable

Examples

>>> @add_xml_data(time=('Time', datetime.datetime))
... class MyElement(ActivityElement):
...   [...]

activereader.base.add_xml_descendents(**descendent_classes)

Add properties to a class that return all descendents matching a path.

This provides an alternative usage to directly calling ActivityElement._add_descendent_properties() below a class definition.

Parameters

**descendent_classes

kwargs mapping property_name : descendent class.

  • Each key will be a property name added to the decorated class.

  • All values must be a subclass of ActivityElement.

Returns

A class decorator.

Return type

callable

Examples

>>> @add_xml_descendents(subelements=MySubElement)
... class MyElement(ActivityElement):
...   [...]

activereader.base.create_attr_prop(key, conv_type=<class 'int'>)

Add property inside an ActivityElement class definition that accesses data using get_attr().

Parameters
  • key (str) – The attribute name within the contained lxml element.

  • conv_type (data type) – Data type that the attribute text will be converted to. Python type, or datetime.datetime to read a time using dateutil.parser.isoparse(). Defaults to int.

Returns

accessor for underlying lxml element’s attribute data.

Return type

property

Examples

>>> class MyElement(ActivityElement):
...   my_prop = create_attr_prop('Time', datetime.datetime)
...   """Property docstring goes here (if desired)"""
...   [...]

activereader.base.create_data_prop(path, conv_type=<class 'int'>)

Add property inside an ActivityElement class definition that accesses data using get_data().

Parameters
  • path (str) – The tag name or path of the desired subelement of the contained lxml element whose text will be retrieved. Text will be retrieved from the first matching subelement.

  • conv_type (data type) – Data type that the element text will be converted to. Python type, or datetime.datetime to read a time using dateutil.parser.isoparse(). Defaults to int.

Returns

accessor for the underlying lxml element’s data.

Return type

property

Examples

>>> class MyElement(ActivityElement):
...   my_prop = create_data_prop('Time', datetime.datetime)
...   """Property docstring goes here (if desired)."""
...   [...]
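
A property factory used inside a class body can be sketched like this. The names sketch_data_prop and SketchLap are hypothetical, xml.etree.ElementTree stands in for lxml, and the int default simply mirrors the signature documented above.

```python
import xml.etree.ElementTree as ET


def sketch_data_prop(path, conv_type=int):
    """Hypothetical counterpart of create_data_prop(); defaults to int."""

    def getter(self):
        text = self.elem.findtext(path)
        return None if text is None else conv_type(text)

    return property(getter)


class SketchLap:
    def __init__(self, elem):
        self.elem = elem

    # Properties are assigned directly in the class body.
    calories = sketch_data_prop("Calories")
    distance_m = sketch_data_prop("DistanceMeters", float)


lap = SketchLap(
    ET.fromstring(
        "<Lap><Calories>120</Calories><DistanceMeters>1609.3</DistanceMeters></Lap>"
    )
)
```

Compared with a decorator, this form keeps each property next to any docstring written for it in the class definition.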

activereader.base.create_descendent_prop(descendent_class)

Add descendent list accessor property inside an ActivityElement class definition.

The property returns all the current element’s descendents matching the tag name defined in the descendent class.

Parameters

descendent_class (ActivityElement) – Must be a subclass of ActivityElement.

Returns

accessor for underlying lxml element’s descendent list.

Return type

property

Examples

>>> class MyElement(ActivityElement):
...   my_prop = create_descendent_prop(MySubElement)
...   """Property docstring goes here (if desired)"""
...   [...]