Example using XML source data from a URL

This example (cansas.py) shows how content can be read from a URL that provides XML, parsed with the lxml package, and written as a reST table. This particular XML uses a namespace, which we set up in the variable nsmap:

#!/usr/bin/env python

import io
import lxml.etree
import pyRestTable
import urllib.request

# SVN_BASE_URL = "http://www.cansas.org/svn/1dwg/trunk"
CANSAS_URL = (
    "https://raw.githubusercontent.com/"
    "canSAS-org/1dwg/master/"
    "examples/cs_af1410.xml"
)


def main():
    # map the "cs" prefix used in the XPath queries to the canSAS namespace
    nsmap = dict(cs="urn:cansas1d:1.1")

    r = urllib.request.urlopen(CANSAS_URL).read().decode("utf-8")
    doc = lxml.etree.parse(io.StringIO(r))

    node_list = doc.xpath("//cs:SASentry", namespaces=nsmap)
    t = pyRestTable.Table()
    t.labels = ["SASentry", "description", "measurements"]
    for node in node_list:
        # defaults, used when the entry has no Title element
        s_name, title, count = "", "", ""
        subnode = node.find("cs:Title", namespaces=nsmap)
        if subnode is not None:
            s = lxml.etree.tostring(subnode, method="text")
            title = s.strip().decode()
            s_name = node.attrib["name"]
            count = len(node.xpath("cs:SASdata", namespaces=nsmap))
        t.rows += [[s_name, title, count]]

    return t


if __name__ == "__main__":
    table = main()
    # use "complex" since s_name might be empty string
    print(table.reST(fmt="complex"))

The output from this code:

10 SASentry elements in https://raw.githubusercontent.com/canSAS-org/1dwg/master/examples/cs_af1410.xml

+-----------+--------------------------------------+--------------+
| entry     | description                          | measurements |
+===========+======================================+==============+
| AF1410:10 | AF1410-10 (AF1410 steel aged 10 h)   | 2            |
+-----------+--------------------------------------+--------------+
| AF1410:8h | AF1410-8h (AF1410 steel aged 8 h)    | 2            |
+-----------+--------------------------------------+--------------+
| AF1410:qu | AF1410-qu (AF1410 steel aged 0.25 h) | 2            |
+-----------+--------------------------------------+--------------+
| AF1410:cc | AF1410-cc (AF1410 steel aged 100 h)  | 2            |
+-----------+--------------------------------------+--------------+
| AF1410:2h | AF1410-2h (AF1410 steel aged 2 h)    | 2            |
+-----------+--------------------------------------+--------------+
| AF1410:50 | AF1410-50 (AF1410 steel aged 50 h)   | 2            |
+-----------+--------------------------------------+--------------+
| AF1410:20 | AF1410-20 (AF1410 steel aged 20 h)   | 1            |
+-----------+--------------------------------------+--------------+
| AF1410:5h | AF1410-5h (AF1410 steel aged 5 h)    | 2            |
+-----------+--------------------------------------+--------------+
| AF1410:1h | AF1410-1h (AF1410 steel aged 1 h)    | 2            |
+-----------+--------------------------------------+--------------+
| AF1410:hf | AF1410-hf (AF1410 steel aged 0.5 h)  | 2            |
+-----------+--------------------------------------+--------------+

The resulting table is shown:

10 SASentry elements in http://www.cansas.org/svn/1dwg/trunk/examples/cs_af1410.xml

========= ==================================== ============
entry     description                          measurements
========= ==================================== ============
AF1410:10 AF1410-10 (AF1410 steel aged 10 h)   2
AF1410:8h AF1410-8h (AF1410 steel aged 8 h)    2
AF1410:qu AF1410-qu (AF1410 steel aged 0.25 h) 2
AF1410:cc AF1410-cc (AF1410 steel aged 100 h)  2
AF1410:2h AF1410-2h (AF1410 steel aged 2 h)    2
AF1410:50 AF1410-50 (AF1410 steel aged 50 h)   2
AF1410:20 AF1410-20 (AF1410 steel aged 20 h)   1
AF1410:5h AF1410-5h (AF1410 steel aged 5 h)    2
AF1410:1h AF1410-1h (AF1410 steel aged 1 h)    2
AF1410:hf AF1410-hf (AF1410 steel aged 0.5 h)  2
========= ==================================== ============
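
The namespace handling in the script above can be tried without network access. The following is a minimal sketch, not part of cansas.py. It uses the standard library's xml.etree.ElementTree in place of lxml, and both the SAMPLE_XML document and the summarize() helper are hypothetical names made up for illustration. It assumes only that the elements live in the urn:cansas1d:1.1 namespace, as in the example above.

```python
import xml.etree.ElementTree as ET

# Hypothetical, abbreviated stand-in for a canSAS file.
# This is NOT the content of cs_af1410.xml.
SAMPLE_XML = """\
<SASroot xmlns="urn:cansas1d:1.1">
  <SASentry name="demo:1">
    <Title>first demo entry</Title>
    <SASdata/>
    <SASdata/>
  </SASentry>
  <SASentry name="demo:2">
    <SASdata/>
  </SASentry>
</SASroot>
"""


def summarize(xml_text):
    """Return [name, title, SASdata count] for each SASentry element."""
    # Map the "cs" prefix used in the queries to the namespace URI.
    nsmap = {"cs": "urn:cansas1d:1.1"}
    root = ET.fromstring(xml_text)
    rows = []
    for entry in root.findall(".//cs:SASentry", nsmap):
        title_node = entry.find("cs:Title", nsmap)
        # A missing Title yields an empty string instead of an error.
        if title_node is not None:
            title = (title_node.text or "").strip()
        else:
            title = ""
        count = len(entry.findall("cs:SASdata", nsmap))
        rows.append([entry.get("name", ""), title, count])
    return rows


if __name__ == "__main__":
    for row in summarize(SAMPLE_XML):
        print(row)
```

The prefix cs is arbitrary; only the URI it maps to must match the namespace declared in the XML. Unlike the script above, this sketch reports the name and the SASdata count even when an entry has no Title, so the second row has an empty description.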