There are some points to note about the nomenclature that the pypond and pond code bases use to refer to the “columns of data” in the time series event objects. This TimeSeries:
DATA = dict(
    name="traffic",
    columns=["time", "value", "status"],
    points=[
        [1400425947000, 52, "ok"],
        [1400425948000, 18, "ok"],
        [1400425949000, 26, "fail"],
        [1400425950000, 93, "offline"]
    ]
)
contains two columns: value and status.
However, this TimeSeries:
DATA_FLOW = dict(
    name="traffic",
    columns=["time", "direction"],
    points=[
        [1400425947000, {'in': 1, 'out': 2}],
        [1400425948000, {'in': 3, 'out': 4}],
        [1400425949000, {'in': 5, 'out': 6}],
        [1400425950000, {'in': 7, 'out': 8}]
    ]
)
contains only one column, direction, but that column has two more columns, in and out, nested under it. In the following examples, the paths to these nested columns will be referred to as “deep paths.”
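To make the distinction between top-level and nested columns concrete, here is a minimal sketch in plain Python. It does not use pypond itself; the helper rows_as_dicts is a hypothetical name introduced only for illustration, and the data is a shortened copy of the DATA_FLOW example.

```python
# Illustrative only: plain Python, not pypond's internals.
DATA_FLOW = dict(
    name="traffic",
    columns=["time", "direction"],
    points=[
        [1400425947000, {'in': 1, 'out': 2}],
        [1400425948000, {'in': 3, 'out': 4}],
    ],
)


def rows_as_dicts(wire):
    """Pair each point with the column names (hypothetical helper)."""
    return [dict(zip(wire["columns"], point)) for point in wire["points"]]


events = rows_as_dicts(DATA_FLOW)
first = events[0]

# Every column other than "time" is a top-level data column.
top_level = [c for c in DATA_FLOW["columns"] if c != "time"]

# The keys of the dict stored under "direction" are the nested columns.
nested = sorted(first["direction"])

print(top_level)  # ['direction']
print(nested)     # ['in', 'out']
```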
When specifying columns to the methods that set, retrieve and manipulate data, we use two argument types: field_spec and field_path. They are similar yet different enough to warrant this document.
A field_path refers to a single column in a series. Any method that takes a field_path as an argument only acts on one column at a time. The value passed to this argument can be either a string, a list or None.
When a string is passed, it can be one of the following formats:
- The name of a single top-level column. In the DATA example above, this would be either value or status.
- A “deep path” to a single nested column, with the segments of the path separated by periods. In the DATA_FLOW example above, the incoming data could be retrieved with direction.in as the field_path.
When a list (or tuple) is passed as a field_path, each element in the iterable is a single segment of the path to a column. So to compare with the string examples:
- ['value'] would be equivalent to the string value.
- ['direction', 'in'] would be equivalent to the string direction.in.
This is particularly important to note because this behavior is different from passing a list to a field_spec argument.
None
If no field_path is specified (defaulting to None), then the default column, value, will be used.
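The three field_path variants can be sketched as follows. This is a rough model of the rules described above, not pypond's actual implementation: field_path_segments and get_column are hypothetical helpers, and a plain dict stands in for an event's data.

```python
def field_path_segments(field_path=None):
    """Turn a field_path into a list of path segments (hypothetical helper)."""
    if field_path is None:
        # No field_path given: fall back to the default column.
        return ["value"]
    if isinstance(field_path, str):
        # "direction.in" -> ["direction", "in"]; "value" -> ["value"]
        return field_path.split(".")
    if isinstance(field_path, (list, tuple)):
        # Each element is already one segment of the path.
        return list(field_path)
    raise TypeError("field_path must be a string, list, tuple or None")


def get_column(event_data, field_path=None):
    """Follow the segments down into the (possibly nested) data."""
    current = event_data
    for segment in field_path_segments(field_path):
        current = current[segment]
    return current


flat = {"value": 52, "status": "ok"}
flow = {"direction": {"in": 1, "out": 2}}

print(get_column(flat))                       # 52
print(get_column(flat, "status"))             # ok
print(get_column(flow, "direction.in"))       # 1
print(get_column(flow, ["direction", "in"]))  # 1
```

Whatever form the argument takes, it always resolves to exactly one column.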
A field_spec refers to one or more columns in a series. When a method takes a field_spec, it may act on multiple columns in a TimeSeries. The value passed to this argument can be either a string, a list or None.
The string variant is essentially identical to the field_path string variant: it is a path to a single column in one of the following formats:
- The name of a single top-level column. In the DATA example above, this would be either value or status.
- A “deep path” to a single nested column, with the segments of the path separated by periods. In the DATA_FLOW example above, the incoming data could be retrieved with direction.in as the field_spec.
Passing a list (or tuple) to field_spec differs from the field_path list behavior described above in that it explicitly refers to one or more columns. Rather than each element being a segment of a path, each element is a full path to a single column.
Using the previous examples:
- ['value', 'status'] would act on both the value and status columns from the DATA example.
- ['direction.in', 'direction.out']: here each element is a fully formed “deep path” to one of the two data columns in the DATA_FLOW example.
The lists do not have to have more than one element: passing ['value'] is equivalent to passing 'value'.
NOTE: accidentally passing this style of list to an argument that is actually a field_path will most likely result in an EventException being raised. Passing something like ['in', 'out'] as a field_path will attempt to retrieve the nested column in.out, which probably doesn’t exist.
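The pitfall in the note can be reproduced with a small stand-in. This sketch assumes a plain dict in place of an event and a hypothetical walk helper; a KeyError plays the role of the EventException that the library is described as raising.

```python
def walk(event_data, segments):
    """Follow path segments into nested data (hypothetical helper)."""
    current = event_data
    for segment in segments:
        current = current[segment]
    return current


flow = {"direction": {"in": 1, "out": 2}}

# Intended as "the in column and the out column" (field_spec style),
# but a field_path reads it as the single nested path in.out.
mistaken = ["in", "out"]

try:
    walk(flow, mistaken)
    outcome = "found"
except KeyError as err:
    # In this sketch a KeyError stands in for the library's exception.
    outcome = "missing segment: {}".format(err.args[0])

print(outcome)  # missing segment: in
```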
None
If no field_spec is specified (defaulting to None), then the default column, value, will be used.
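For comparison with field_path, the field_spec rules can be modeled the same way. Again this is only a sketch under the assumptions stated in this document: field_spec_columns and get_columns are hypothetical helpers, not pypond functions, and plain dicts stand in for event data.

```python
def field_spec_columns(field_spec=None):
    """Turn a field_spec into a list of full column paths (hypothetical helper)."""
    if field_spec is None:
        # No field_spec given: fall back to the default column.
        return ["value"]
    if isinstance(field_spec, str):
        # A string names exactly one column, simple or deep.
        return [field_spec]
    if isinstance(field_spec, (list, tuple)):
        # Each element is a complete path to one column.
        return list(field_spec)
    raise TypeError("field_spec must be a string, list, tuple or None")


def get_columns(event_data, field_spec=None):
    """Return {path: value} for every column named by the field_spec."""
    result = {}
    for path in field_spec_columns(field_spec):
        current = event_data
        for segment in path.split("."):
            current = current[segment]
        result[path] = current
    return result


flat = {"value": 52, "status": "ok"}
flow = {"direction": {"in": 1, "out": 2}}

print(get_columns(flat))                        # {'value': 52}
print(get_columns(flat, ["value", "status"]))   # {'value': 52, 'status': 'ok'}
print(get_columns(flow, ["direction.in", "direction.out"]))
```

Note the contrast with field_path: here a list yields one result per element, rather than one result for the whole list.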
This is a less common variant of the field_spec. It is used in the case where the method is going to specifically act on multiple columns. Otherwise, its usage is identical to the list variant of field_spec.