# Data columns: field_spec and field_path There are some points to note about the nomenclature that the `pypond` and `pond` code bases use to refer to the "columns of data" in the time series event objects. This `TimeSeries`: ``` DATA = dict( name="traffic", columns=["time", "value", "status"], points=[ [1400425947000, 52, "ok"], [1400425948000, 18, "ok"], [1400425949000, 26, "fail"], [1400425950000, 93, "offline"] ] ) ``` contains two columns: `value` and `status`. However this `TimeSeries`: ``` DATA_FLOW = dict( name="traffic", columns=["time", "direction"], points=[ [1400425947000, {'in': 1, 'out': 2}], [1400425948000, {'in': 3, 'out': 4}], [1400425949000, {'in': 5, 'out': 6}], [1400425950000, {'in': 7, 'out': 8}] ] ) ``` contains only one column `direction`, but that column has two more columns - `in` and `out` - nested under it. In the following examples, these nested columns will be referred to as "deep paths." When specifying columns to the methods that set, retrieve and manipulate data, we use two argument types: `field_spec` and `field_path`. They are similar yet different enough to warrant this document. ## field_path A `field_path` refers to a **single** column in a series. Any method that takes a `field_path` as an argument only acts on one column at a time. The value passed to this argument can be either a string, a list or `None`. ### String variant When a string is passed, it can be one of the following formats: * simple path - the name of a single "top level" column. In the `DATA` example above, this would be either `value` or `status`. * deep path - the path pointing to a single nested columns with each "segment" of the path delimited with a period. In the `DATA_FLOW` example above, the incoming data could be retrieved with `direction.in` as the `field_path`. ### List variant When a `list` (or `tuple`) is passed as a `field_path`, each element in the iterable is **a single segment of the path to a column**. So to compare with the string examples: * `['value']` would be equivalent to the string `value`. * `['direction', 'in']` would be equivalent to the string `direction.in`. This is particularly important to note because **this behavior is different** than passing a list to a `field_spec` arg. ### `None` If no `field_path` is specified (defaulting to `None`), then the default column `value` will be used. ## field_spec A `field_spec` refers to **one or more** columns in a series. When a method takes a `field_spec`, it may act on multiple columns in a `TimeSeries`. The value passed to this argument can be either a string, a list or `None`. ### String variant The string variant is essentially identical to the `field_path` string variant - it is a path to a single column of one of the following formats: * simple path - the name of a single "top level" column. In the `DATA` example above, this would be either `value` or `status`. * deep path - the path pointing to a single nested columns with each "segment" of the path delimited with a period. In the `DATA_FLOW` example above, the incoming data could be retrieved with `direction.in` as the `field_path`. ### List variant Passing a `list` (or `tuple`) to `field_spec` is different than the aforementioned behavior in that it is explicitly referring to **one or more columns**. Rather than each element being segments of a path, **each element is a full path to a single column**. Using the previous examples: * `['in', 'out']` would act on both the `in` and `out` columns from the `DATA` example. * `['direction.in', 'direction.out']` - here each element is a fully formed "deep path" to the two data columns in the `DATA_FLOW` example. The lists do not have to have more than one element: `['value'] == 'value'`. NOTE: accidentally passing this style of list to an arg that is actually a `field_path` will most likely result in an `EventException` being raised. Passing something like `['in', 'out']` as a `field_path` will attempt to retrieve the nested column `in.out` which probably doesn't exist. ### `None` If no `field_spec` is specified (defaulting to `None`), then the default column `value` will be used. ## field_spec_list This is a less common variant of the `field_spec`. It is used in the case where the method is going to **specifically act on multiple columns**. Otherwise, it's usage is identical to the list variant of `field_spec`.