Layout Guide
The Layout concept gives us the ability to manage the table header and to pick or skip arbitrary fields and rows from the raw data stream.
#
Layout Usage
Instances of the Layout class are accepted by many classes and functions:
- Resource
- describe
- extract
- validate
- and more
You just need to create a Layout instance with the desired options and pass it to the classes and functions listed above.
#
Layout Options
The sections below go through the available Layout options.
#
Header
This is a boolean flag, defaulting to True, that indicates whether the data has a header row. When it is set to False, the first row is treated as a data row rather than as a header.
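To make the effect of the header flag concrete, here is a minimal standard-library sketch. It is not the library's implementation: `read_rows` is a hypothetical helper, and the generated labels `field1`, `field2` are an assumption made for illustration.

```python
import csv
import io

CSV_TEXT = "id,name\n1,english\n2,german\n"


def read_rows(text, header=True):
    """Return (labels, data_rows) for CSV text.

    Hypothetical helper: with header=False every row counts as data.
    """
    rows = list(csv.reader(io.StringIO(text)))
    if header:
        return rows[0], rows[1:]
    # No header row: synthesize placeholder labels (an assumption of this sketch).
    width = len(rows[0]) if rows else 0
    return [f"field{i}" for i in range(1, width + 1)], rows


labels, data = read_rows(CSV_TEXT, header=False)
print(labels)
print(data)
```

With `header=False`, the row `id,name` ends up in `data` alongside the other rows.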
#
Header Rows
If header is True, which is the default, this parameter indicates where to find the header row, or the header rows for a multiline header. For example, the first two data rows can be treated as part of the header.
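The idea can be sketched with the standard library alone. This is an illustration under assumptions, not the library's code: `split_header` is a hypothetical helper, row numbers are assumed to be 1-based, and data is assumed to start after the last header row.

```python
import csv
import io

CSV_TEXT = "id,name\n1,english\n2,german\n3,french\n"


def split_header(text, header_rows=(1,)):
    """Split CSV text into header rows and data rows.

    Hypothetical helper: header_rows holds 1-based row numbers, and
    everything after the last header row is treated as data.
    """
    rows = list(csv.reader(io.StringIO(text)))
    header = [rows[number - 1] for number in header_rows]
    data = rows[max(header_rows):]
    return header, data


# Treat the first three rows (the header plus two data rows) as the header.
header, data = split_header(CSV_TEXT, header_rows=(1, 2, 3))
print(header)
print(data)
```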
#
Header Join
If there are multiple header rows, as set by the header_rows parameter, we can set a string to be used as the separator when joining the header's cells. This is usually handy for some "fancy" Excel files, although for simplicity it can be illustrated with a CSV file.
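A minimal sketch of the join operation, assuming the header rows have already been collected as lists of cells. `join_header` is a hypothetical helper, and the single-space default separator is an assumption of this sketch.

```python
def join_header(header_rows, header_join=" "):
    """Join the cells of a multiline header column by column.

    Hypothetical helper; the default separator is an assumption.
    """
    # zip(*rows) regroups the cells so that each tuple holds one column.
    return [header_join.join(cells) for cells in zip(*header_rows)]


header_rows = [["id", "name"], ["1", "english"], ["2", "german"]]
joined = join_header(header_rows, header_join="/")
print(joined)
```

Each resulting label is built from the cells of one column, top to bottom, separated by the chosen string.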
#
Header Case
By default, a header is validated in case-sensitive mode. To disable this behaviour, set the header_case parameter to False. This option is accepted by any Layout, and a layout can be passed to extract, validate and other functions. Please note that it doesn't affect the resulting header; it only affects how the header is validated.
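The difference between the two modes can be sketched as a plain comparison between the labels found in the data and the field names expected by a schema. `header_matches` is a hypothetical helper written for this illustration, not the library's validation logic.

```python
def header_matches(labels, names, header_case=True):
    """Check header labels against expected field names.

    Hypothetical helper: with header_case=False the comparison ignores
    case, but the labels themselves are left untouched.
    """
    if header_case:
        return list(labels) == list(names)
    return [label.lower() for label in labels] == [name.lower() for name in names]


labels = ["ID", "NAME"]
names = ["id", "name"]
strict = header_matches(labels, names)
relaxed = header_matches(labels, names, header_case=False)
print(strict, relaxed, labels)
```

Only the outcome of the check changes; `labels` still holds the upper-case values afterwards.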
#
Pick/Skip Fields
We can pick and skip arbitrary fields based on the header row. These options accept a list of field numbers, a list of strings, or a regex to match, so several different queries can select the same fields from a file.
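The three kinds of query can be sketched as follows. This is an illustration only: `match_field` and `filter_fields` are hypothetical helpers, field numbers are assumed to be 1-based, and a compiled `re.Pattern` stands in for whatever regex syntax the library itself expects.

```python
import re


def match_field(query, position, name):
    """Match one query against a field's 1-based position or its name."""
    if isinstance(query, int):
        return query == position
    if isinstance(query, re.Pattern):
        return query.search(name) is not None
    return query == name


def filter_fields(header, rows, pick=None, skip=None):
    """Keep picked fields and drop skipped ones (hypothetical helper)."""
    keep = []
    for position, name in enumerate(header, start=1):
        if pick is not None and not any(match_field(q, position, name) for q in pick):
            continue
        if skip is not None and any(match_field(q, position, name) for q in skip):
            continue
        keep.append(position - 1)
    return [header[i] for i in keep], [[row[i] for i in keep] for row in rows]


header = ["id", "name"]
rows = [["1", "english"], ["2", "german"]]

# Four different queries that all keep only the "name" field.
by_number = filter_fields(header, rows, pick=[2])
by_name = filter_fields(header, rows, pick=["name"])
by_regex = filter_fields(header, rows, pick=[re.compile(r"^na")])
by_skip = filter_fields(header, rows, skip=[1])
print(by_number)
```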
#
Limit/Offset Fields
Two options limit the number of fields, similar to SQL's LIMIT and OFFSET directives.
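In SQL terms, the offset skips a number of leading fields and the limit caps how many fields follow. A minimal sketch using list slicing, where `slice_fields` and its parameter names are assumptions chosen for this illustration:

```python
def slice_fields(header, rows, limit_fields=None, offset_fields=0):
    """Apply an SQL-like offset and limit to the fields (hypothetical helper)."""
    stop = None if limit_fields is None else offset_fields + limit_fields
    sliced_header = header[offset_fields:stop]
    sliced_rows = [row[offset_fields:stop] for row in rows]
    return sliced_header, sliced_rows


header = ["id", "name", "country"]
rows = [["1", "english", "uk"], ["2", "german", "de"]]

# Skip the first field, then keep a single field.
sliced_header, sliced_rows = slice_fields(header, rows, limit_fields=1, offset_fields=1)
print(sliced_header, sliced_rows)
```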
#
Pick/Skip Rows
These options work like their field counterparts, but the query is compared to the first cell of a row. Different queries can select the same rows, but note that when picking we also need to pick the header row. In addition, there is a special value <blank> that matches a row if it is completely blank.
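The row queries can be sketched in the same way as the field queries. The helpers `match_row` and `filter_rows` are hypothetical, row numbers are assumed to be 1-based and to include the header row, and string queries are compared to the first cell by exact equality, which may differ from the library's actual matching rule.

```python
import re


def match_row(query, number, cells):
    """Match one query against a row number or the row's first cell."""
    if query == "<blank>":
        # Special value: matches only a completely blank row.
        return all(cell == "" for cell in cells)
    if isinstance(query, int):
        return query == number
    first = cells[0] if cells else ""
    if isinstance(query, re.Pattern):
        return query.search(first) is not None
    return first == query


def filter_rows(rows, pick=None, skip=None):
    """Keep picked rows and drop skipped ones (hypothetical helper)."""
    kept = []
    for number, cells in enumerate(rows, start=1):
        if pick is not None and not any(match_row(q, number, cells) for q in pick):
            continue
        if skip is not None and any(match_row(q, number, cells) for q in skip):
            continue
        kept.append(cells)
    return kept


rows = [["id", "name"], ["1", "english"], ["", ""], ["2", "german"]]

# Picking has to include the header row (row 1, first cell "id").
by_number = filter_rows(rows, pick=[1, 2, 4])
by_cell = filter_rows(rows, pick=["id", "1", "2"])
by_blank = filter_rows(rows, skip=["<blank>"])
print(by_blank)
```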
#
Limit/Offset Rows
This is a popular option for limiting the number of rows to read.
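A minimal sketch of the row limit and offset, assuming they apply to data rows once the header has been set aside. `slice_rows` and its parameter names are assumptions for illustration; `itertools.islice` lets the same logic work on a stream without reading every row first.

```python
from itertools import islice


def slice_rows(data_rows, limit_rows=None, offset_rows=0):
    """Apply an SQL-like offset and limit to data rows (hypothetical helper)."""
    stop = None if limit_rows is None else offset_rows + limit_rows
    # islice consumes the iterable lazily, so it stops reading at the limit.
    return list(islice(data_rows, offset_rows, stop))


data_rows = [["1", "english"], ["2", "german"], ["3", "french"]]

# Skip the first data row, then read a single row.
window = slice_rows(data_rows, limit_rows=1, offset_rows=1)
print(window)
```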