pyquery
required import
from pyquery import PyQuery as pq
read document
open document from file
document = pq(filename='/path/to/file')
open document from url
document = pq(url='[url]')
open document from raw html
document = pq('<html><h1>header</h1></html>')
extract data
query css selector
p = document("[selector]")
get outer html
print(p.outerHtml())
get inner html
print(p.html())
get contained text
print(p.text())
documentation
https://pyquery.readthedocs.io/en/latest/api.html#pyquery.pyquery.PyQuery