Provides XML data management services for parsing, manipulation and serialisation.
The XML class is designed to provide robust functionality for creating, parsing and maintaining XML data structures. It supports both well-formed and loosely structured XML documents, offering flexible parsing behaviours to accommodate various XML formats. The class includes comprehensive support for XPath queries, content manipulation and document validation.
XML documents can be loaded into an XML object through multiple mechanisms:
The Path field allows loading from file system sources, with automatic parsing upon initialisation. The class supports LoadFile() caching for frequently accessed files, improving performance for repeated operations.
The Statement field enables direct parsing of XML strings, supporting dynamic content processing and in-memory document construction.
The Source field provides object-based input, allowing XML data to be sourced from any object supporting the Read action.
For batch processing scenarios, the Path or Statement fields can be reset post-initialisation, causing the XML object to automatically rebuild itself. This approach optimises memory usage by reusing existing object instances rather than creating new ones.
Successfully parsed XML data is accessible through the Tags field, which contains a hierarchical array of XMLTag structures. Each XMLTag represents a complete XML element including its attributes, content and child elements. The structure maintains the original document hierarchy, enabling both tree traversal and direct element access.
C++ developers benefit from direct access to the Tags field, represented as pf::vector<XMLTag>. This provides efficient iteration and element access with standard STL semantics. However, direct modification of the Tags array is discouraged as it can destabilise internal object state; developers should use the provided methods for safe manipulation.
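As a rough illustration of read-only traversal, the sketch below walks a tag hierarchy recursively and counts the elements that carry a given name. It does not use the Parasol headers: the `XMLAttrib` and `XMLTag` types are simplified stand-ins that mirror a subset of the documented fields, `std::vector` takes the place of `pf::vector`, and `count_named()` and `make_sample()` are hypothetical helpers. It relies on the convention noted under SetAttrib() that the attribute at position 0 declares the tag name.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simplified stand-ins for the documented structures (not the Parasol headers).
struct XMLAttrib {
   std::string Name;
   std::string Value;
};

struct XMLTag {
   int ID = 0;
   int ParentID = 0;
   std::vector<XMLAttrib> Attribs;   // Attribs[0].Name holds the tag name
   std::vector<XMLTag> Children;
};

// Hypothetical helper: count every tag in the hierarchy whose name matches.
// The tree is only read, never modified.
int count_named(const std::vector<XMLTag> &Tags, const std::string &Name)
{
   int total = 0;
   for (const auto &tag : Tags) {
      if (!tag.Attribs.empty() && tag.Attribs[0].Name == Name) total++;
      total += count_named(tag.Children, Name);
   }
   return total;
}

// Builds <doc><item/><group><item/></group></doc> by hand for demonstration.
std::vector<XMLTag> make_sample()
{
   XMLTag first_item;
   first_item.ID = 2; first_item.ParentID = 1;
   first_item.Attribs.push_back({ "item", "" });

   XMLTag nested_item;
   nested_item.ID = 4; nested_item.ParentID = 3;
   nested_item.Attribs.push_back({ "item", "" });

   XMLTag group;
   group.ID = 3; group.ParentID = 1;
   group.Attribs.push_back({ "group", "" });
   group.Children.push_back(nested_item);

   XMLTag doc;
   doc.ID = 1;
   doc.Attribs.push_back({ "doc", "" });
   doc.Children.push_back(first_item);
   doc.Children.push_back(group);

   return { doc };
}
```

Calling `count_named(make_sample(), "item")` visits both the top-level and the nested `item` element. Because the helper takes the hierarchy by const reference, it cannot disturb the object's internal state.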
The XML class consists of the following fields:
Name | Type | Comment |
---|---|---|
Flags | XMF | Controls XML parsing behaviour and processing options. |
Modified | INT | A timestamp of when the XML data was last modified. |
The Modified field provides an artificial timestamp value that changes whenever the XML data is modified (e.g. by a tag insert or update). Storing the current Modified value and comparing it against a later reading makes it easy to determine that a change has been made. A rough idea of the total number of change requests can also be calculated by subtracting the stored value from the later one.
Path | STRING | Set this field if the XML document originates from a file source. |
XML documents can be loaded from the file system by specifying a file path in this field. If set post-initialisation, all currently loaded data will be cleared and the file will be parsed automatically. The XML class supports LoadFile(), so an XML file can be pre-cached by the program if it is frequently used during a program's life cycle.
ReadOnly | INT | Prevents modifications and enables caching for a loaded XML data source. |
This field can be set to …
Source | OBJECTPTR | Set this field if the XML data is to be sourced from another object. |
An XML document can be loaded from another object by referencing it here, on the condition that the object's class supports the Read action. If set post-initialisation, all currently loaded data will be cleared and the source object will be parsed automatically.
Start | INT | Set a starting cursor to affect the starting point for some XML operations. |
When using any XML function that creates an XML string (e.g. SaveToObject), the XML object will include the entire XML tree by default. Defining the Start value will restrict processing to a specific tag and its children. The Start field currently affects the SaveToObject() action and the Statement field.
Statement | STRING | XML data is processed through this field. |
Set the Statement field to parse an XML formatted data string through the object. If this field is set after initialisation then the XML object will clear any existing data first. Be aware that setting this field with an invalid statement will result in an empty XML object.
Reading the Statement field will return a serialised string of XML data. By default all tags will be included in the statement unless a predefined starting position is set by the Start field. The string result is an allocation that must be freed.
Tags | STRUCT [] | Provides direct access to the XML document structure. |
The Tags field exposes the complete XML document structure as a hierarchical array of XMLTag structures. This field becomes available after successful XML parsing and provides the primary interface for reading XML content programmatically. Each XMLTag will have at least one attribute set in the …
Direct read access to the Tags hierarchy is safe and efficient for traversing the document structure. However, modifications should be performed using the XML object's methods (InsertXML(), SetAttrib(), RemoveTag(), etc.) to maintain internal consistency and trigger appropriate cache invalidation.
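The change-detection pattern described for the Modified field can be sketched as follows. This is only a model: `XMLStandIn` is a hypothetical stand-in that exposes nothing but a counter, it assumes that each change request advances Modified by one, and `ChangeWatcher` is an illustrative helper rather than part of the XML class.

```cpp
#include <cassert>

// Stand-in for an XML object; only the Modified value is modelled.
struct XMLStandIn {
   int Modified = 0;
   void change() { Modified++; }   // assumption: each change request bumps the value by one
};

// Remembers the Modified value seen at construction or at the last sync().
class ChangeWatcher {
public:
   explicit ChangeWatcher(const XMLStandIn &XML) : xml(XML), last_seen(XML.Modified) { }

   // True if the data has been modified since the stored reading.
   bool changed() const { return xml.Modified != last_seen; }

   // Rough count of change requests since the stored reading.
   int pending() const { return xml.Modified - last_seen; }

   // Accept the current state as the new baseline.
   void sync() { last_seen = xml.Modified; }

private:
   const XMLStandIn &xml;
   int last_seen;
};
```

A cache built from the document would call `changed()` before each use and rebuild itself, followed by `sync()`, only when the value has moved.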
The following actions are currently supported:
Clear | Completely clears all XML data and resets the object to its initial state. |
ERR acClear(*Object)
The Clear action removes all parsed XML content from the object, including the complete tag hierarchy, cached data structures and internal state information. This action effectively returns the XML object to its freshly-initialised condition, ready to accept new XML data.
DataFeed | Processes and integrates external XML data into the object's document structure. |
ERR acDataFeed(*Object, OBJECTID Object, DATA Datatype, APTR Buffer, INT Size)
The DataFeed action provides a mechanism for supplying XML content to the object from external sources or streaming data. This action supports both complete document replacement and incremental content addition, depending on the current state of the XML object.
The action accepts data in XML or plain text format and automatically performs parsing and integration. When the object contains no existing content, the provided data becomes the complete document structure. If the object already contains parsed XML, the new data is parsed separately and appended to the existing tag hierarchy.
If the provided data contains malformed XML or cannot be parsed according to the current validation settings, the action will return appropriate error codes without modifying the existing document structure. This ensures that partial parsing failures do not corrupt previously loaded content.
Attempts to feed data into a read-only XML object will be rejected to maintain document integrity.
GetKey | Retrieves data from an XML object. |
ERR acGetKey(*Object, CSTRING Key, STRING Value, INT Size)
The XML class uses key-values for the execution of XPath queries. Documentation of the XPath standard is outside the scope of this document; however, the following examples illustrate the majority of uses for this query language and a number of special instructions that we support:
Reset | Clears the information held in an XML object. |
SaveToObject | Saves XML data to a storage object (e.g. File). |
ERR acSaveToObject(*Object, OBJECTID Dest, CLASSID ClassID)
SetKey | Sets attributes and content in the XML tree using XPaths. |
ERR acSetKey(*Object, CSTRING Key, CSTRING Value)
Use SetKey to add tag attributes and content using XPaths. The XPath is specified in the …
It is not possible to add new tags using this action - it is only possible to update existing tags.
Please note that making changes to the XML tree will render all previously obtained tag pointers and indexes invalid.
The following methods are currently supported:
Count | Count all tags that match a given XPath expression. |
ERR xml::Count(OBJECTPTR Object, CSTRING XPath, INT * Result)
This method will count all tags that match a given …
Filter | Filters the XML data structure to retain only a specific tag and its descendants. |
ERR xml::Filter(OBJECTPTR Object, CSTRING XPath)
The Filter method provides a mechanism for reducing large XML documents to a specific subtree, permanently removing all content that exists outside the targeted element and its children. This operation is particularly valuable for performance optimisation when working with large documents where only a specific section is relevant.
The filtering process begins by locating the target element using the provided XPath expression. Once found, a new XML structure is created containing only the matched tag and its complete descendant hierarchy. All sibling tags, parent elements (excluding the direct lineage) and unrelated branches are permanently discarded.
FindTag | Searches for XML elements using XPath expressions with optional callback processing. |
ERR xml::FindTag(OBJECTPTR Object, CSTRING XPath, FUNCTION * Callback, INT * Result)
The FindTag method provides the primary mechanism for locating XML elements within the document structure using XPath 1.0 compatible expressions. The method supports both single-result queries and comprehensive tree traversal with callback-based processing for complex operations.
The method supports comprehensive XPath syntax including absolute paths, attribute matching, content matching (a Parasol extension), wildcarding, deep scanning with double-slash notation, indexed access and complex expressions with multiple criteria.
When no callback function is provided, FindTag returns the first matching element and terminates the search immediately. This is optimal for simple queries where only the first occurrence is required.
When a callback function is specified, FindTag continues searching through the entire document structure, calling the provided function for each matching element. This enables comprehensive processing of all matching elements in a single traversal. The C++ prototype for the callback function is …
The callback should return …
GetAttrib | Retrieves the value of a specific XML attribute from a tagged element. |
ERR xml::GetAttrib(OBJECTPTR Object, INT Index, CSTRING Attrib, CSTRING * Value)
The GetAttrib method provides efficient access to individual attribute values within XML elements. Given a tag identifier and attribute name, the method performs a case-insensitive search through the element's attribute collection and returns the corresponding value.
When a specific attribute name is provided, the method searches through all attributes of the target tag. The search is case-insensitive to accommodate XML documents with varying capitalisation conventions. When the attribute name is NULL or empty, the method returns the tag name itself, providing convenient access to element names without requiring separate API calls.
Performance Considerations
For applications requiring frequent attribute access or high-performance scenarios, C++ developers should consider direct access to the XMLAttrib structure array. This bypasses the method call overhead and provides immediate access to all attributes simultaneously. The method performs a linear search through the attribute collection, so performance scales with the number of attributes per element. For elements with many attributes, caching frequently accessed values may improve performance.
Data Integrity
The returned string pointer references internal XML object memory and remains valid until the XML structure is modified. Callers should not attempt to modify or free the returned string. For persistent storage, the string content should be copied to application-managed memory.
GetContent | Extracts the immediate text content of an XML element, excluding nested tags. |
ERR xml::GetContent(OBJECTPTR Object, INT Index, STRING Buffer, INT Length)
The GetContent method provides efficient extraction of text content from XML elements using a shallow parsing approach. It retrieves only the immediate text content of the specified element, deliberately excluding any text contained within nested child elements. This behaviour is valuable for scenarios requiring precise content extraction without recursive tag processing.
Consider the following XML structure:
<body> Hello <bold>emphasis</bold> world! </body>
The GetContent method would extract …
Comparison with Deep Extraction
For scenarios requiring complete text extraction including all nested content, use the Serialise() method with appropriate flags to perform deep content analysis. The GetContent method is optimised for cases where nested tag content should be excluded from the result.
If the resulting content exceeds the buffer capacity, the result will be truncated but remain null-terminated.
It is recommended that C++ programs bypass this method and access the XMLAttrib structure directly.
GetTag | Returns a pointer to the XMLTag structure for a given tag index. |
ERR xml::GetTag(OBJECTPTR Object, INT Index, struct XMLTag ** Result)
This method will return the XMLTag structure for a given tag …
InsertContent | Inserts text content into the XML document structure at specified positions. |
ERR xml::InsertContent(OBJECTPTR Object, INT Index, XMI Where, CSTRING Content, INT * Result)
The InsertContent method will insert content strings into any position within the XML tree. A content string must be provided in the …
To modify existing content, call SetAttrib() instead.
InsertXML | Parse an XML string and insert it in the XML tree. |
ERR xml::InsertXML(OBJECTPTR Object, INT Index, XMI Where, CSTRING XML, INT * Result)
The InsertXML() method is used to translate and insert a new set of XML tags into any position within the XML tree. A standard XML statement must be provided in the XML parameter and the target insertion point is specified in the Index parameter. An insertion point relative to the target index must be specified in the …
InsertXPath | Inserts an XML statement in an XML tree. |
ERR xml::InsertXPath(OBJECTPTR Object, CSTRING XPath, XMI Where, CSTRING XML, INT * Result)
The InsertXPath method is used to translate and insert a new set of XML tags into any position within the XML tree. A standard XML statement must be provided in the XML parameter and the target insertion point is referenced as a valid …
MoveTags | Move an XML tag group to a new position in the XML tree. |
ERR xml::MoveTags(OBJECTPTR Object, INT Index, INT Total, INT DestIndex, XMI Where)
This method is used to move XML tags within the XML tree structure. It supports moving a single tag or a group of tags from one index to another. The client must supply the index of the tag that will be moved and the index of the target tag. All child tags of the source will be included in the move. An insertion point relative to the target index must be specified in the …
RemoveTag | Removes tag(s) from the XML structure. |
ERR xml::RemoveTag(OBJECTPTR Object, INT Index, INT Total)
The RemoveTag method is used to remove one or more tags from an XML structure. Child tags will automatically be discarded as a consequence of using this method, in order to maintain a valid XML structure. This method is capable of deleting multiple tags if the …
This method is volatile and will destabilise any cached address pointers that have been acquired from the XML object.
RemoveXPath | Removes tag(s) from the XML structure, using an XPath lookup. |
ERR xml::RemoveXPath(OBJECTPTR Object, CSTRING XPath, INT Limit)
The RemoveXPath method is used to remove one or more tags from an XML structure. Child tags will automatically be discarded as a consequence of using this method, in order to maintain a valid XML structure. Individual tag attributes can also be removed if an attribute is referenced at the end of the …
The removal routine will be repeated so that each tag that matches the XPath will be deleted, or the …
This method is volatile and will destabilise any cached address pointers that have been acquired from the XML object.
Serialise | Serialise part of the XML tree to an XML string. |
ERR xml::Serialise(OBJECTPTR Object, INT Index, XMF Flags, STRING * Result)
The Serialise() method will serialise all or part of the XML data tree to a string. The string will be allocated as a memory block and stored in the Result parameter. It must be freed once the data is no longer required.
SetAttrib | Adds, updates and removes XML attributes. |
ERR xml::SetAttrib(OBJECTPTR Object, INT Index, XMS Attrib, CSTRING Name, CSTRING Value)
This method is used to update and add attributes to existing XML tags, as well as adding or modifying content. The data for the attribute is defined in the …
NOTE: The attribute at position 0 declares the name of the tag and should not normally be accompanied with a value declaration. However, if the tag represents content within its parent, then the Name must be set to …
Sort | Sorts XML tags to your specifications. |
ERR xml::Sort(OBJECTPTR Object, CSTRING XPath, CSTRING Sort, XSF Flags)
The Sort method is used to sort a single branch of XML tags in ascending or descending order. An … The …
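The lookup rules described for GetAttrib() (a case-insensitive linear search, with an empty name returning the tag name) can be modelled in a few lines. The sketch below is an approximation and not the method's actual implementation: `XMLAttrib` is a stand-in built on `std::string`, `iequals()` and `find_attrib()` are hypothetical helpers, and the comparison only handles ASCII case folding.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for the documented XMLAttrib structure.
struct XMLAttrib {
   std::string Name;
   std::string Value;
};

// Case-insensitive comparison of two ASCII strings.
bool iequals(const std::string &A, const std::string &B)
{
   if (A.size() != B.size()) return false;
   for (std::size_t i = 0; i < A.size(); i++) {
      const int ca = std::tolower(static_cast<unsigned char>(A[i]));
      const int cb = std::tolower(static_cast<unsigned char>(B[i]));
      if (ca != cb) return false;
   }
   return true;
}

// Hypothetical helper modelled on the GetAttrib() description.
// Returns a pointer into the attribute list, or nullptr if nothing matches.
// As with GetAttrib(), the pointer is only valid until the list is modified.
const std::string * find_attrib(const std::vector<XMLAttrib> &Attribs, const std::string &Name)
{
   if (Attribs.empty()) return nullptr;
   if (Name.empty()) return &Attribs[0].Name;   // empty name: return the tag name

   // Linear search, so the cost grows with the number of attributes.
   for (const auto &attrib : Attribs) {
      if (iequals(attrib.Name, Name)) return &attrib.Value;
   }
   return nullptr;
}
```

Callers that need the value beyond the next modification of the tree should copy the string rather than hold on to the returned pointer.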
Standard flags for the XML class.
Name | Description |
---|---|
XMF::INCLUDE_COMMENTS | By default, comments are stripped when parsing XML input unless this flag is specified. |
XMF::INCLUDE_SIBLINGS | Include siblings when building an XML string (GetXMLString() only) |
XMF::INCLUDE_WHITESPACE | By default the XML parser will trim whitespace (such as return codes, spaces and tabs) found in the XML content between tags. Setting this flag turns off this feature, allowing all whitespace to be included. |
XMF::INDENT | Indent the output of serialised XML to improve readability. |
XMF::LOCK_REMOVE | Prevents removal of tags from the XML tree. This specifically affects the RemoveTag and RemoveXPath methods. |
XMF::LOG_ALL | Print extra log messages. |
XMF::NEW | Creates an empty XML object on initialisation - if the Path field has been set, the source file will not be loaded. |
XMF::NO_ESCAPE | Turns off escape code conversion. |
XMF::OMIT_TAGS | Prevents tags from being output when the XML is serialised (output content only). |
XMF::PARSE_ENTITY | Entity references in the DTD will be parsed automatically. |
XMF::PARSE_HTML | Automatically parse HTML escape codes. |
XMF::READABLE | Indent the output of serialised XML to improve readability. |
XMF::STRIP_CDATA | Do not echo CDATA sections. Note that this option is used as a parameter, not an object flag. |
XMF::STRIP_CONTENT | Strip all content from incoming XML data. |
XMF::STRIP_HEADERS | XML headers found in the source data will not be included in the parsed results. |
XMF::WELL_FORMED | By default, the XML class will accept badly structured XML data. This flag requires that XML statements must be well-formed (tags must balance) or an ERR::BadData error will be returned during processing. |
Tag insertion options.
Name | Description |
---|---|
XMI::CHILD | Insert as the first child of the target. |
XMI::CHILD_END | Insert as the last child of the target. |
XMI::NEXT | Insert as the next tag of the target. |
XMI::PREV | Insert as the previous tag of the target. |
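To make the four insertion positions concrete, the following sketch applies them to a minimal tree. It is a model only: `Node` is a hypothetical stand-in for a tag, the `Where` enum mirrors the XMI names with arbitrary values, and `insert_relative()` and `names()` are illustrative helpers that do not correspond to any method of the XML class.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Minimal stand-in for a tag: a name plus child tags.
struct Node {
   std::string Name;
   std::vector<Node> Children;
};

// Local mirror of the four documented options (the values are arbitrary).
enum class Where { PREV, NEXT, CHILD, CHILD_END };

// Inserts Fresh relative to Siblings[Target].
// Returns false if Target is out of range.
bool insert_relative(std::vector<Node> &Siblings, std::size_t Target, Where Pos, Node Fresh)
{
   if (Target >= Siblings.size()) return false;
   const auto offset = static_cast<std::ptrdiff_t>(Target);

   switch (Pos) {
      case Where::PREV:   // becomes the sibling immediately before the target
         Siblings.insert(Siblings.begin() + offset, std::move(Fresh));
         break;
      case Where::NEXT:   // becomes the sibling immediately after the target
         Siblings.insert(Siblings.begin() + offset + 1, std::move(Fresh));
         break;
      case Where::CHILD: {   // becomes the first child of the target
         auto &kids = Siblings[Target].Children;
         kids.insert(kids.begin(), std::move(Fresh));
         break;
      }
      case Where::CHILD_END:   // becomes the last child of the target
         Siblings[Target].Children.push_back(std::move(Fresh));
         break;
   }
   return true;
}

// Joins the names of a sibling list with commas, for inspection.
std::string names(const std::vector<Node> &Nodes)
{
   std::string out;
   for (const auto &node : Nodes) {
      if (!out.empty()) out += ",";
      out += node.Name;
   }
   return out;
}
```

Note that PREV and NEXT shift the positions of later siblings, which is one reason why indexes obtained before an insertion should not be reused afterwards.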
Options for the SetAttrib() method.
Name | Description |
---|---|
XMS::NEW | Adds a new attribute. Note that if the attribute already exists, this will result in at least two attributes of the same name in the tag. |
XMS::UPDATE | As for UPDATE_ONLY, but if the attribute does not exist, it will be created. |
XMS::UPDATE_ONLY | SetAttrib will find the target attribute and update it. It is not possible to rename the attribute when using this technique. ERR::Search is returned if the attribute cannot be found. |
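The difference between the three options is easiest to see side by side. The sketch below models them on a plain attribute list; it is not the SetAttrib() implementation. `Mode` is a local mirror of the XMS names, `Result` is a stand-in for the relevant ERR values, `set_attrib()` is a hypothetical helper, and the exact-match name comparison is an assumption. Attribute removal is not modelled.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in for the documented XMLAttrib structure.
struct XMLAttrib {
   std::string Name;
   std::string Value;
};

enum class Mode { NEW, UPDATE, UPDATE_ONLY };   // local mirror of the XMS options
enum class Result { Okay, Search };             // stand-in for ERR values

// Hypothetical helper that applies one of the three modes to an attribute list.
Result set_attrib(std::vector<XMLAttrib> &Attribs, Mode Op, const std::string &Name, const std::string &Value)
{
   if (Op == Mode::NEW) {
      // Always appends, so an existing name ends up duplicated.
      Attribs.push_back({ Name, Value });
      return Result::Okay;
   }

   for (auto &attrib : Attribs) {
      if (attrib.Name == Name) {   // assumption: exact-match comparison
         attrib.Value = Value;
         return Result::Okay;
      }
   }

   if (Op == Mode::UPDATE) {
      // Not found: UPDATE falls back to creating the attribute.
      Attribs.push_back({ Name, Value });
      return Result::Okay;
   }

   return Result::Search;   // UPDATE_ONLY fails when the attribute is missing
}
```

In this model UPDATE is the safest default for "set this attribute to that value", while NEW is only appropriate when the caller already knows the name is absent.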
Options for the Sort method.
Name | Description |
---|---|
XSF::CHECK_SORT | Tells the algorithm to check for a 'sort' attribute in each analysed tag and if found, the algorithm will use that as the sort value instead of that indicated in the Attrib field. |
XSF::DESC | Sort in descending order. |
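The effect of the two sort options can be approximated with a small model. Everything here is an assumption layered on the flag descriptions: `Tag` and `XMLAttrib` are stand-ins, `attrib_value()`, `sort_key()` and `sort_tags()` are hypothetical helpers, values are compared as plain strings, and tags lacking the attribute sort as empty strings. The real Sort method may differ on all of these points.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

struct XMLAttrib {
   std::string Name;
   std::string Value;
};

// Stand-in for a tag; only the attribute list is needed here.
struct Tag {
   std::vector<XMLAttrib> Attribs;
};

// Returns the value of the named attribute, or nullptr if it is absent.
const std::string * attrib_value(const Tag &T, const std::string &Name)
{
   for (const auto &attrib : T.Attribs) {
      if (attrib.Name == Name) return &attrib.Value;
   }
   return nullptr;
}

// Chooses the value a tag is ordered by. With CheckSort enabled, a 'sort'
// attribute on the tag overrides the attribute requested by the caller.
std::string sort_key(const Tag &T, const std::string &Attrib, bool CheckSort)
{
   if (CheckSort) {
      if (const std::string *value = attrib_value(T, "sort")) return *value;
   }
   if (const std::string *value = attrib_value(T, Attrib)) return *value;
   return std::string();
}

// Sorts a list of sibling tags in ascending order, or descending if Desc is set.
void sort_tags(std::vector<Tag> &Tags, const std::string &Attrib, bool CheckSort, bool Desc)
{
   std::stable_sort(Tags.begin(), Tags.end(), [&](const Tag &A, const Tag &B) {
      const std::string ka = sort_key(A, Attrib, CheckSort);
      const std::string kb = sort_key(B, Attrib, CheckSort);
      return Desc ? (kb < ka) : (ka < kb);
   });
}
```

`std::stable_sort` is used so that tags with equal keys keep their original document order.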
Fields of the XMLAttrib structure.
Field | Type | Description |
---|---|---|
Name | std::string | Name of the attribute |
Value | std::string | Value of the attribute |
Fields of the XMLTag structure.
Field | Type | Description |
---|---|---|
ID | INT | Unique ID assigned to the tag on creation |
ParentID | INT | Unique ID of the parent tag |
LineNo | INT | Line number on which this tag was encountered |
Flags | XTF | Optional flags |
Attribs | pf::vector<XMLAttrib> | Array of attributes for this tag |
Children | pf::vector<XMLTag> | Array of child tags |