HTML Tutorial

This functionality requires an experimental html plugin.

Frictionless supports parsing the HTML format once the plugin is installed:

CLI
pip install frictionless[html]
pip install 'frictionless[html]' # for zsh shell

Reading Data

You can read this file format using Package/Resource, for example:

Python
from pprint import pprint
from frictionless import Resource
resource = Resource(path='data/table1.html')
pprint(resource.read_rows())
[{'id': 1, 'name': 'english'}, {'id': 2, 'name': '中国人'}]
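
For orientation, the input is an ordinary HTML document containing a table. The contents of data/table1.html are not shown in this tutorial, so the markup below is an assumed stand-in, and the small parser (the names TableCells and html_table_to_dicts are hypothetical) is only a standard-library sketch of how a header row plus data rows could map to the dictionaries above. It is not how Frictionless implements its HTML parser.

```python
from html.parser import HTMLParser

# Assumed stand-in for data/table1.html (the real file is not shown here).
SAMPLE_HTML = """
<html><body>
<table>
  <tr><td>id</td><td>name</td></tr>
  <tr><td>1</td><td>english</td></tr>
  <tr><td>2</td><td>中国人</td></tr>
</table>
</body></html>
"""


class TableCells(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])  # start a new row
        elif tag in ("td", "th"):
            self._in_cell = True
            self.rows[-1].append("")  # start a new, empty cell

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        # Text between tags (whitespace, etc.) is ignored unless inside a cell.
        if self._in_cell:
            self.rows[-1][-1] += data


def html_table_to_dicts(html):
    """Treat the first row as the header and zip it with each later row."""
    parser = TableCells()
    parser.feed(html)
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


rows = html_table_to_dicts(SAMPLE_HTML)
print(rows)
```

In this sketch every value stays a string. Frictionless additionally infers field types, which is why id appears as an integer in the output above.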

Writing Data

The same applies to writing:

Python
from frictionless import Resource
source = Resource(data=[['id', 'name'], [1, 'english'], [2, 'german']])
target = source.write('table.html')
print(target)
print(target.to_view())
{'path': 'table.html'}
+----+-----------+
| id | name      |
+====+===========+
|  1 | 'english' |
+----+-----------+
|  2 | 'german'  |
+----+-----------+

Configuring Data

There is a dialect to configure HTML reading. For example, its selector option chooses which element of the page is read as the table:

Python
from frictionless import Resource
from frictionless.plugins.html import HtmlDialect
resource = Resource(path='data/table1.html', dialect=HtmlDialect(selector='#id'))
print(resource.read_rows())
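
To make the effect of a selector such as '#id' concrete, here is a minimal standard-library sketch, assuming the selector follows CSS conventions so that '#id' matches the element whose id attribute is "id". The page content and the TableById class are hypothetical, and the code only illustrates the idea of picking one table out of several. It does not use or reproduce the plugin's own selection logic.

```python
from html.parser import HTMLParser

# Hypothetical page with two tables; only the second one has id="id".
PAGE = """
<table><tr><td>skip</td></tr></table>
<table id="id">
  <tr><td>id</td><td>name</td></tr>
  <tr><td>1</td><td>english</td></tr>
</table>
"""


class TableById(HTMLParser):
    """Collect cell text only from the <table> whose id attribute matches."""

    def __init__(self, table_id):
        super().__init__()
        self.table_id = table_id
        self.rows = []
        self._in_table = False
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            # Enter "collecting" mode only for the table with the wanted id.
            self._in_table = dict(attrs).get("id") == self.table_id
        elif self._in_table and tag == "tr":
            self.rows.append([])
        elif self._in_table and tag in ("td", "th"):
            self._in_cell = True
            self.rows[-1].append("")

    def handle_endtag(self, tag):
        if tag == "table":
            self._in_table = False
        elif tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.rows[-1][-1] += data


parser = TableById("id")
parser.feed(PAGE)
selected_rows = parser.rows
print(selected_rows)
```

Cells from the first table never reach selected_rows, which mirrors what a selector is for when a page holds more than one table.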