Example using XML source data from a URL
This example (cansas.py) shows how content can be scraped from a URL that
provides XML (using the lxml package) and written as a reST table. This
particular XML uses a namespace, which we set up in the variable nsmap.
That dictionary maps the prefix cs to the namespace URI, and it is passed
to every XPath and find query:
#!/usr/bin/env python

import io
import lxml.etree
import pyRestTable
import urllib.request

# SVN_BASE_URL = "http://www.cansas.org/svn/1dwg/trunk"
CANSAS_URL = (
    "https://raw.githubusercontent.com/"
    "canSAS-org/1dwg/master/"
    "examples/cs_af1410.xml"
)


def main():
    # map the prefix "cs" to the namespace URI used by the XML file
    nsmap = dict(cs="urn:cansas1d:1.1")

    r = urllib.request.urlopen(CANSAS_URL).read().decode("utf-8")
    doc = lxml.etree.parse(io.StringIO(r))

    node_list = doc.xpath("//cs:SASentry", namespaces=nsmap)
    t = pyRestTable.Table()
    t.labels = ["SASentry", "description", "measurements"]
    for node in node_list:
        # defaults, in case this SASentry has no Title element
        s_name, title, count = "", "", ""
        subnode = node.find("cs:Title", namespaces=nsmap)
        if subnode is not None:
            s = lxml.etree.tostring(subnode, method="text")
            title = s.strip().decode()
            s_name = node.attrib["name"]
            count = len(node.xpath("cs:SASdata", namespaces=nsmap))
        t.rows += [[s_name, title, count]]

    return t


if __name__ == "__main__":
    table = main()
    # use "complex" since s_name might be empty string
    print(table.reST(fmt="complex"))
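The role of the namespace mapping can be tried offline with a small sketch. The assumptions here: the inline `SAMPLE_XML` document and the function names are hypothetical, made up for illustration, and the sketch uses the standard library's `xml.etree.ElementTree` rather than lxml so that it runs without extra packages. The prefix-to-URI dictionary plays the same part as nsmap in cansas.py.

```python
import xml.etree.ElementTree as ET

# Hypothetical inline sample; the real cs_af1410.xml file holds far more content.
SAMPLE_XML = """\
<SASroot xmlns="urn:cansas1d:1.1">
  <SASentry name="demo:1">
    <Title>first demo entry</Title>
    <SASdata/>
    <SASdata/>
  </SASentry>
  <SASentry name="demo:2">
    <Title>second demo entry</Title>
    <SASdata/>
  </SASentry>
</SASroot>
"""

# Map the prefix "cs" (an arbitrary choice) to the namespace URI of the document.
NSMAP = {"cs": "urn:cansas1d:1.1"}


def summarize(xml_text):
    """Return [name, title, number of SASdata children] for each SASentry."""
    root = ET.fromstring(xml_text)
    rows = []
    for entry in root.findall(".//cs:SASentry", NSMAP):
        title_node = entry.find("cs:Title", NSMAP)
        title = "" if title_node is None else (title_node.text or "").strip()
        count = len(entry.findall("cs:SASdata", NSMAP))
        rows.append([entry.get("name", ""), title, count])
    return rows


def summarize_without_namespace(xml_text):
    """Same search without the prefix: finds nothing in a namespaced file."""
    root = ET.fromstring(xml_text)
    return root.findall(".//SASentry")


if __name__ == "__main__":
    for row in summarize(SAMPLE_XML):
        print(row)
    print(summarize_without_namespace(SAMPLE_XML))
```

The second function illustrates why the mapping matters: every element in the sample belongs to the default namespace, so a query for the bare tag name matches no elements.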
The output from cansas.py:
10 SASentry elements in https://raw.githubusercontent.com/canSAS-org/1dwg/master/examples/cs_af1410.xml

+-----------+--------------------------------------+--------------+
| entry     | description                          | measurements |
+===========+======================================+==============+
| AF1410:10 | AF1410-10 (AF1410 steel aged 10 h)   | 2            |
+-----------+--------------------------------------+--------------+
| AF1410:8h | AF1410-8h (AF1410 steel aged 8 h)    | 2            |
+-----------+--------------------------------------+--------------+
| AF1410:qu | AF1410-qu (AF1410 steel aged 0.25 h) | 2            |
+-----------+--------------------------------------+--------------+
| AF1410:cc | AF1410-cc (AF1410 steel aged 100 h)  | 2            |
+-----------+--------------------------------------+--------------+
| AF1410:2h | AF1410-2h (AF1410 steel aged 2 h)    | 2            |
+-----------+--------------------------------------+--------------+
| AF1410:50 | AF1410-50 (AF1410 steel aged 50 h)   | 2            |
+-----------+--------------------------------------+--------------+
| AF1410:20 | AF1410-20 (AF1410 steel aged 20 h)   | 1            |
+-----------+--------------------------------------+--------------+
| AF1410:5h | AF1410-5h (AF1410 steel aged 5 h)    | 2            |
+-----------+--------------------------------------+--------------+
| AF1410:1h | AF1410-1h (AF1410 steel aged 1 h)    | 2            |
+-----------+--------------------------------------+--------------+
| AF1410:hf | AF1410-hf (AF1410 steel aged 0.5 h)  | 2            |
+-----------+--------------------------------------+--------------+
When rendered, the resulting table looks like this:
10 SASentry elements in http://www.cansas.org/svn/1dwg/trunk/examples/cs_af1410.xml

| entry | description | measurements |
|---|---|---|
| AF1410:10 | AF1410-10 (AF1410 steel aged 10 h) | 2 |
| AF1410:8h | AF1410-8h (AF1410 steel aged 8 h) | 2 |
| AF1410:qu | AF1410-qu (AF1410 steel aged 0.25 h) | 2 |
| AF1410:cc | AF1410-cc (AF1410 steel aged 100 h) | 2 |
| AF1410:2h | AF1410-2h (AF1410 steel aged 2 h) | 2 |
| AF1410:50 | AF1410-50 (AF1410 steel aged 50 h) | 2 |
| AF1410:20 | AF1410-20 (AF1410 steel aged 20 h) | 1 |
| AF1410:5h | AF1410-5h (AF1410 steel aged 5 h) | 2 |
| AF1410:1h | AF1410-1h (AF1410 steel aged 1 h) | 2 |
| AF1410:hf | AF1410-hf (AF1410 steel aged 0.5 h) | 2 |
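For readers unfamiliar with the reST grid-table layout that cansas.py prints, the following minimal sketch assembles one by hand. `grid_table` is a hypothetical helper written only for illustration. It is not pyRestTable's implementation, and it assumes single-line cells with no column spans.

```python
def grid_table(labels, rows):
    """Format labels and rows as a reST grid table (hypothetical helper)."""
    # Convert every cell to text; the header row comes first.
    cells = [[str(c) for c in row] for row in [labels] + rows]
    # Each column is as wide as its longest cell.
    widths = [max(len(row[i]) for row in cells) for i in range(len(labels))]

    def rule(char):
        # Horizontal border: "-" between rows, "=" under the header.
        return "+" + "+".join(char * (w + 2) for w in widths) + "+"

    def line(row):
        # Left-justify each cell and pad it with one space on each side.
        return "|" + "|".join(f" {c:<{w}} " for c, w in zip(row, widths)) + "|"

    out = [rule("-"), line(cells[0]), rule("=")]
    for row in cells[1:]:
        out += [line(row), rule("-")]
    return "\n".join(out)


if __name__ == "__main__":
    print(grid_table(["entry", "measurements"], [["AF1410:10", 2], ["AF1410:20", 1]]))
```

Because every cell is bounded by explicit borders, an empty cell stays unambiguous in this layout, which fits the comment in cansas.py about choosing the "complex" format when s_name might be an empty string.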