Parsers
Notional includes several parsers for importing external content. Each parser accepts either a string (data) or a file-like object as its input.
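One way to picture the "string or file-like object" contract is a small normalizing helper. The sketch below is illustrative only and is not Notional's implementation; `open_source` is a hypothetical name introduced for this example.

```python
import io


def open_source(data):
    """Return a readable text stream for a string or a file-like object.

    Hypothetical helper for illustration; not part of Notional.
    """
    if isinstance(data, str):
        # Wrap raw string data so it can be read like a file.
        return io.StringIO(data)
    if hasattr(data, "read"):
        # Anything with a read() method is treated as already file-like.
        return data
    raise TypeError("expected a string or a file-like object")


from_string = open_source("<p>hello</p>").read()
from_stream = open_source(io.StringIO("<p>hello</p>")).read()
```

Both calls yield the same text, which is why a caller can hand a parser either form of input.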
HTML Parser
The HTML parser reads an HTML document into Notion API objects. From there, the caller may create a page in Notion using the rendered content.
    from notional.parser import HtmlParser

    parser = HtmlParser(base="https://www.example.com/")

    with open(filename, "r") as fp:
        parser.parse(fp)

    doc = notion.pages.create(
        parent=parent_page,
        title=parser.title,
        children=parser.content,
    )
Note: while the parser aims to be general purpose, there may be conditions where it cannot interpret the HTML document. Please submit an issue if you find an example of valid HTML that is not properly converted.
After parsing, the HtmlParser will contain title, meta, and content.
HtmlParser.title
If the parser encounters a <title> element, this property will be set to the contents. Otherwise, the parser will attempt to look for a name in the input data stream. Typically, this would be the filename if the data is a file-like object. If no <title> or name exists, this property will be None.
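The fallback order described above (a <title> element, then the stream's name, then None) can be sketched with the standard library's `html.parser`. This is an approximation under stated assumptions, not Notional's code: `TitleFinder` and `resolve_title` are hypothetical names, and the sketch assumes the stream exposes its name as a `name` attribute, as objects returned by `open()` do.

```python
import io
from html.parser import HTMLParser


class TitleFinder(HTMLParser):
    """Collect the text inside the first <title> element (illustrative)."""

    def __init__(self):
        super().__init__()
        self.title = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        # Only the first <title> is recorded; later ones are ignored.
        if tag == "title" and self.title is None:
            self._in_title = True
            self.title = ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Text may arrive in several chunks, so accumulate it.
        if self._in_title:
            self.title += data


def resolve_title(stream):
    """Prefer <title>, then the stream's name, then None."""
    finder = TitleFinder()
    finder.feed(stream.read())
    finder.close()
    if finder.title is not None:
        return finder.title.strip()
    return getattr(stream, "name", None)


class NamedStream(io.StringIO):
    """In-memory stand-in for an open file that carries a name."""

    name = "notes.html"


with_title = resolve_title(io.StringIO("<title> My Page </title><p>hi</p>"))
from_name = resolve_title(NamedStream("<p>no title here</p>"))
missing = resolve_title(io.StringIO("<p>no title here</p>"))
```

In this sketch a plain `io.StringIO` has no `name` attribute, so the third call falls through to None.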
HtmlParser.meta
The meta property is a dict containing data from <meta> tags in the input, where each element has the form meta_name: meta_value.
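To make the meta_name: meta_value shape concrete, the following sketch gathers <meta> tags into a dict using the standard library. It is not Notional's implementation; `MetaCollector` is a hypothetical class, and the choice to keep only tags that carry a `name` attribute is an assumption of this example.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Gather <meta name=... content=...> pairs into a dict (illustrative)."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        name = attr_map.get("name")
        # Assumption: tags without a name (e.g. charset) are skipped.
        if name is not None:
            self.meta[name] = attr_map.get("content")


collector = MetaCollector()
collector.feed('<meta name="author" content="Ada"><meta charset="utf-8">')
collector.close()
meta = collector.meta
```

Here `meta` ends up holding a single entry keyed by the tag's name.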
HtmlParser.content
Content is rendered into a list of blocks, ready to be created or appended to a page.
CSV Parser
The CSV parser reads comma-separated value (CSV) content and generates the appropriate database schema along with the content. To populate the database, each content entry must be created as an individual page.
    from notional.parser import CsvParser

    parser = CsvParser(header_row=True)

    with open(filename, "r") as fp:
        parser.parse(fp)

    # create the database first; its schema comes from the parsed CSV
    db = notion.databases.create(
        parent=parent_page,
        title=parser.title,
        schema=parser.schema,
    )

    # each row becomes its own page, with the new database as parent
    for props in parser.content:
        page = notion.pages.create(
            parent=db,
            properties=props,
        )
The CsvParser accepts the following configuration options when initialized:
header_row - indicates that the input data has a header row, which will be used to generate the schema (defaults to True)
title_column - indicates which column number to use as the title for entries (defaults to 0)
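The effect of these two options can be approximated with the standard `csv` module. The sketch below is an assumption-laden illustration, not Notional's logic: `read_fields` is a hypothetical helper, and the generated `column_N` names used when there is no header row are invented for this example.

```python
import csv
import io


def read_fields(stream, header_row=True, title_column=0):
    """Split CSV input into field names, the title field, and data rows."""
    rows = list(csv.reader(stream))
    if header_row and rows:
        # The first row supplies the field names.
        names, records = rows[0], rows[1:]
    else:
        # Assumption: without a header, synthesize placeholder names.
        width = len(rows[0]) if rows else 0
        names = [f"column_{i}" for i in range(width)]
        records = rows
    title_field = names[title_column] if names else None
    return names, title_field, records


sample = "Name,Role\nAda,Engineer\n"
with_header = read_fields(io.StringIO(sample))
without_header = read_fields(io.StringIO(sample), header_row=False)
role_as_title = read_fields(io.StringIO(sample), title_column=1)
```

With `header_row=False` the first line is treated as data, so the sample yields two records instead of one.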
After parsing, the CsvParser will contain title, schema, and content.
CsvParser.title
The parser will attempt to read a name property from the input data source. As seen in the above example, this is a useful property when creating the database. If there is no name available, this property will be None.
CsvParser.schema
The parser will generate a schema for the CSV data, which is used when creating the database. The schema is presented as a dict where each element has the form field_name: field_type and can be passed to the databases.create() method.
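As a rough picture of the field_name: field_type shape, the sketch below maps column names to type labels. It is illustrative only: `build_schema` is a hypothetical function, and the strings "title" and "rich_text" are placeholders standing in for whatever type objects the parser actually produces.

```python
def build_schema(field_names, title_column=0):
    """Map each field name to a placeholder type label (illustrative)."""
    schema = {}
    for index, name in enumerate(field_names):
        # Assumption: the title column gets a title type, the rest are text.
        schema[name] = "title" if index == title_column else "rich_text"
    return schema


schema = build_schema(["Name", "Role", "Team"])
```

The resulting dict has one entry per column, with exactly one field marked as the title.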
CsvParser.content
CSV data is rendered as a list of page properties for the database. The content must be created as separate pages with the new database as parent. Specifically, the content property is a list where each element is a dict of the form field_name: field_value. Each element is a full set of properties for creating a new page.
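The list-of-dicts shape described above resembles what the standard library's `csv.DictReader` yields. The sketch below only illustrates that shape; plain strings stand in for the property values, which in the real parser may be richer objects.

```python
import csv
import io

data = io.StringIO("Name,Role\nAda,Engineer\nGrace,Admiral\n")

# One dict per CSV row, keyed by field name.
content = [dict(row) for row in csv.DictReader(data)]
```

Each element of `content` corresponds to one row, which is why the rows are created as separate pages under the database.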