arjuna.tpi.parser.xml module¶

Classes to assist in XML Parsing.

class arjuna.tpi.parser.xml.NodeLocator(*, tags: strOrSequence = None, text=None, attrs={}, **attr_kwargs)¶

Bases: object

Locator for finding an XML Node in an XmlNode.

Keyword Arguments

tags – (Optional) Descendant tags for the node. Can be a string of single or multiple tags or a list/tuple of tags.
text – Partial text content.
attrs – Arbitrary attributes as a dictionary. Use this when the attr names are not valid Python names.
**attr_kwargs – Arbitrary number of key-value pairs representing attribute names and values. The values here override those in attrs if any names overlap.

Raises

Exception – If neither tag nor an attribute is provided.

Note

You can use tag and attributes in combination.

Supports nested node finding.

search_node(node: arjuna.tpi.parser.xml.XmlNode) → tuple¶: Search for XmlNode objects that match this locator in the provided XmlNode object.
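
The override rule described for attrs and **attr_kwargs can be pictured as a plain dictionary merge. The following is a minimal sketch under the assumption that the two sources are merged with the keyword arguments applied last; merge_locator_attrs is a hypothetical helper written for illustration and is not part of Arjuna.

```python
def merge_locator_attrs(attrs=None, **attr_kwargs):
    """Hypothetical helper: combine dictionary attrs with keyword attrs."""
    merged = dict(attrs or {})   # copy, so the caller's dictionary is untouched
    merged.update(attr_kwargs)   # keyword values win when a name overlaps
    return merged


# "data-id" is not a valid Python name, so it can only go in the dictionary.
combined = merge_locator_attrs({"data-id": "42", "name": "old"}, name="new")
print(combined)
```

Here the dictionary carries the attribute whose name is not a valid Python identifier, and the keyword argument replaces the overlapping name entry.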

class arjuna.tpi.parser.xml.Xml¶

Bases: object

Helper class to create XmlNode objects.

classmethod from_file(file_path: str) → arjuna.tpi.parser.xml.XmlNode¶

Creates an XmlNode from file.

Parameters: file_path – Absolute path of the XML file.
Returns: Arjuna’s XmlNode object

classmethod from_lxml_element(element, clone=False) → arjuna.tpi.parser.xml.XmlNode¶

Create an XmlNode from an lxml element.

Parameters: element – lxml element

classmethod from_str(xml_str)¶: Create an XmlNode from a string.

classmethod node_locator(*, tags: strOrSequence = None, text=None, attrs={}, **attr_kwargs)¶

Create a locator for finding an XML Node in an XmlNode.

Keyword Arguments

tags – (Optional) Descendant tags for the node. Can be a string of single or multiple tags or a list/tuple of tags.
text – Partial text content.
attrs – Arbitrary attributes as a dictionary. Use this when the attr names are not valid Python names.
**attr_kwargs – Arbitrary number of key-value pairs representing attribute names and values. The values here override those in attrs if any names overlap.

Raises

Exception – If neither tag nor an attribute is provided.

Note

You can use tag and attributes in combination.

Supports nested node finding.

class arjuna.tpi.parser.xml.XmlNode(node)¶

Bases: object

Represents a single node in a parsed XML.

Parameters: node – lxml Element object.

as_str(normalize=False) → str¶

String representation of this node.

normalize: If True, all new lines are removed and more than one consecutive space is converted to a single space.
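
The normalization rule above can be approximated with the standard library. This is only a sketch of the documented behaviour (new lines removed, runs of spaces collapsed), not Arjuna's actual implementation; normalize_source is a hypothetical name used for illustration.

```python
import re


def normalize_source(xml_text: str) -> str:
    """Hypothetical helper mirroring the documented normalization rule."""
    # Step 1: drop every new line character.
    without_newlines = xml_text.replace("\r", "").replace("\n", "")
    # Step 2: collapse two or more consecutive spaces into one.
    return re.sub(r" {2,}", " ", without_newlines)


raw = "<a>\n    <b>hello   world</b>\n</a>"
normalized = normalize_source(raw)
print(normalized)
```

Tabs and other whitespace are left alone in this sketch because the description mentions only new lines and spaces.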

attr(name) → str¶: Value of an attribute of this node.

property attrs: arjuna.tpi.helper.arjtype.CIStringDict¶: All Attributes of this node as a dictionary.

property children: Tuple[arjuna.tpi.parser.xml.XmlNode]¶: All Children of this node as a Tuple of XmlNodes

clone() → arjuna.tpi.parser.xml.XmlNode¶: Create a clone of this XmlNode object.

find(*node_locators, strict: bool = False) → arjuna.tpi.parser.xml.XmlNode¶

Find the first XmlNode that matches one or more NodeLocators.

Parameters: *node_locators – One or more NodeLocators
Keyword Arguments: strict – If True, the call raises an exception if no node is found; otherwise it returns None.
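
The effect of strict can be illustrated without Arjuna. The sketch below uses the standard library's ElementTree path strings as stand-ins for NodeLocator objects. find_sketch is a hypothetical function, and LookupError is an assumed exception type, since the documentation does not say which exception is raised.

```python
import xml.etree.ElementTree as ET


def find_sketch(root, *paths, strict=False):
    """Hypothetical stand-in for find(): return the first match of any path."""
    for path in paths:
        match = root.find(path)
        if match is not None:
            return match  # first locator that finds a node wins
    if strict:
        # Assumed exception type; the real call may raise something else.
        raise LookupError("No node found for: {}".format(", ".join(paths)))
    return None


root = ET.fromstring("<r><a/><b/><c/></r>")
hit = find_sketch(root, ".//missing", ".//b")
miss = find_sketch(root, ".//missing")
```

With strict left at its default, a caller has to check for None; with strict=True, a missing node surfaces immediately as an exception.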

find_keyvalue_texts(key_locator, value_locator) → Tuple[str, str]¶

Returns the texts of the first XmlNodes found for a pair of NodeLocators.

Parameters

key_locator – NodeLocator (for key)
value_locator – NodeLocator (for value)

Returns

2-element tuple containing the text strings.

find_links(*, unique=True, contain='')¶

find_with_xpath(xpath, position=1)¶

Find nth XmlNode that matches an XPath.

Parameters

xpath – XPath string
position – XPath index. Default is 1.

findall(*node_locators, stop_when_matched: bool = False) → List[arjuna.tpi.parser.xml.XmlNode]¶

Find all XmlNodes that match one or more NodeLocators.

Parameters: *node_locators – One or more NodeLocators
Keyword Arguments: stop_when_matched – If True, the call returns nodes found by the first NodeLocator that locates one or more nodes. Default is False.
Returns: List of XmlNodes. If nothing matches, an empty list is returned.
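
The difference made by stop_when_matched can be sketched with the standard library. ElementTree path strings stand in for NodeLocator objects, and findall_sketch is a hypothetical function. It assumes that, when the flag is False, matches from all locators are accumulated in locator order, which the documentation implies but does not state.

```python
import xml.etree.ElementTree as ET


def findall_sketch(root, *paths, stop_when_matched=False):
    """Hypothetical stand-in for findall(), using path strings as locators."""
    found = []
    for path in paths:
        matches = root.findall(path)
        found.extend(matches)
        if stop_when_matched and matches:
            break  # the first locator that finds anything ends the search
    return found


root = ET.fromstring("<r><a/><b/><b/><c/></r>")
all_nodes = findall_sketch(root, ".//x", ".//b", ".//c")
first_hit = findall_sketch(root, ".//x", ".//b", ".//c", stop_when_matched=True)
print([n.tag for n in all_nodes], [n.tag for n in first_hit])
```

In this sketch the first path matches nothing, so the search continues to the second one in both calls; only the call with stop_when_matched=True stops there.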

findall_with_xpath(xpath) → List[arjuna.tpi.parser.xml.XmlNode]¶: Find all XmlNodes that match an XPath.

property following_sibling: arjuna.tpi.parser.xml.XmlNode¶: The XmlNode after this node at the same hierarchical level.

get_inner_xml(normalize=False) → str¶

Inner XML of this node.

Keyword Arguments: normalize – If True, empty lines are removed between children nodes.

get_text(normalize: bool = False) → str¶

Text of this node.

Keyword Arguments: normalize – If True, all extra space is trimmed to a single space.

has_attr(name)¶: Check if an attribute is present.

property inner_xml: str¶: Unaltered inner XML of this node

property links: tuple¶: All links in the XML.

property node¶

Wrapped lxml Element

Not supposed to be used directly.

property normalized_inner_xml: str¶: Normalized inner XML of this node, with empty lines removed between children nodes.

property normalized_source: str¶: String representation of this node with all new lines removed and more than one consecutive space converted to a single space.

property normalized_text: str¶: Text of this node with all extra space trimmed to a single space.

property parent: arjuna.tpi.parser.xml.XmlNode¶: Parent XmlNode

property preceding_sibling: arjuna.tpi.parser.xml.XmlNode¶: The XmlNode before this node at the same hierarchical level.

remove_all_children() → None¶: Remove all children nodes from this node.

property source: str¶: Unaltered string representation of this node.

property tag: str¶: Tag of the node.

property text: str¶: Unaltered text of the node.

property texts: list¶: List of Texts of the node.

Note

Multiple texts are stored separately.

property unique_links: tuple¶: All unique links in the XML.

property value: str¶: Value of the ‘value’ attribute of this node.