MiniDOM: Minimal XML DOM for Swift
Introduction
MiniDOM is a minimal implementation of the Document Object Model interface. It is intended to be simpler than the full DOM but full-featured enough to be useful in most applications.
MiniDOM is fully documented and unit tested. It can be used on iOS, macOS, watchOS, and tvOS. The library is released under the MIT license.
To parse an XML document, simply create a Parser object and call parse():
import Foundation
import MiniDOM

func parseXML(url: URL) -> Document? {
    // The initializer result is optional-chained, so a nil parser yields a nil document.
    let parser = Parser(contentsOf: url)
    let result = parser?.parse()
    return result?.document
}
The resulting structure is a tree of objects implementing the Node protocol: Document, Element, Text, ProcessingInstruction, Comment, and CDATASection. Accessor methods and properties are provided that are similar to those in the DOM specification. DOM trees can be traversed using search methods, path-evaluation methods, or using the visitor design pattern. Each of these will be discussed in detail below.
Installing
MiniDOM supports installation via CocoaPods, Carthage, and the Swift Package Manager.
CocoaPods
Add the following to your Podfile:
pod 'MiniDOM'
Carthage
Add the following to your Cartfile:
github "MiniDOM/MiniDOM"
Swift Package Manager
Add the following dependency to your Package.swift file:
Dependencies
MiniDOM has no third-party dependencies. It uses only Foundation classes, including XMLParser. Unit tests are written using XCTest.
Path evaluation
MiniDOM provides a mechanism for traversing a document via a path. Call Document.evaluate(path:), passing an array of strings representing element node names (Node.nodeName). For example, consider the following document:
Evaluating the path ["a", "b", "z"] (by calling document.evaluate(path: ["a", "b", "z"])) will return an array of two Element objects representing the <z> elements with IDs 3 and 9.
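As a rough illustration of the matching rule described above, the following self-contained sketch walks a toy tree one path component at a time. PathNode, its id field, and the free function evaluate(path:from:) are hypothetical stand-ins written for this example, not MiniDOM types. The tree is made up to mirror the description, and the child-by-child matching is an assumption about the semantics rather than a copy of the library's implementation.

```swift
// Hypothetical stand-in for a DOM node; MiniDOM's real Node types are richer.
final class PathNode {
    let name: String
    let id: Int
    let children: [PathNode]

    init(_ name: String, id: Int, children: [PathNode] = []) {
        self.name = name
        self.id = id
        self.children = children
    }
}

// Starting from `root`, keep only the children whose name matches each
// successive path component (assumed semantics, for illustration only).
func evaluate(path: [String], from root: PathNode) -> [PathNode] {
    var current = [root]
    for component in path {
        current = current.flatMap { node in
            node.children.filter { $0.name == component }
        }
    }
    return current
}

// A made-up tree: two <b> elements and one <c> element under <a>.
let document = PathNode("#document", id: 0, children: [
    PathNode("a", id: 1, children: [
        PathNode("b", id: 2, children: [
            PathNode("z", id: 3),
            PathNode("y", id: 4),
        ]),
        PathNode("c", id: 5, children: [
            PathNode("z", id: 6),
        ]),
        PathNode("b", id: 8, children: [
            PathNode("z", id: 9),
        ]),
    ]),
])

let matches = evaluate(path: ["a", "b", "z"], from: document)
print(matches.map { $0.id })  // [3, 9]
```

The <z> element under <c> is skipped because its parent does not match the second path component, which is why only the elements with IDs 3 and 9 are returned.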
Visitor design pattern
The Visitor Design Pattern is used throughout the MiniDOM library to implement algorithms that involve traversing the DOM tree. It provides a convenient mechanism to separate an algorithm from the object structure on which it operates. It allows operations to be added to the DOM structure without modifying the structures themselves.
A Visitor object is provided to Node.accept(_:) to start the traversal. The Node object calls the appropriate methods on the Visitor object before calling Node.accept(_:) on its child nodes, performing the recursive traversal.
The Visitor protocol defines methods that correspond to each of the Node types in the DOM. Types implementing the Visitor protocol do not need to deal with the actual traversal; its methods are called by the traversal algorithm provided by the DOM classes.
For a simple example of a visitor, see the ElementSearch class in Search.swift. For a more complex example of a visitor, see the PrettyPrinter class in Formatter.swift.
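To make the division of labor concrete, here is a minimal, self-contained sketch of the pattern. The SketchVisitor, SketchNode, SketchElement, SketchText, and NameCollector names are hypothetical and exist only for this example; MiniDOM's own Visitor protocol and Node types have more cases and may differ in detail. The sketch assumes, as described above, that the node notifies the visitor and then forwards it to its children.

```swift
// Hypothetical visitor protocol with one method per node kind.
protocol SketchVisitor {
    func beginVisit(_ element: SketchElement)
    func visit(_ text: SketchText)
}

// Default no-op implementations let a visitor implement only what it needs.
extension SketchVisitor {
    func beginVisit(_ element: SketchElement) {}
    func visit(_ text: SketchText) {}
}

protocol SketchNode {
    func accept(_ visitor: SketchVisitor)
}

final class SketchText: SketchNode {
    let value: String
    init(_ value: String) { self.value = value }

    func accept(_ visitor: SketchVisitor) {
        visitor.visit(self)
    }
}

final class SketchElement: SketchNode {
    let name: String
    let children: [SketchNode]

    init(_ name: String, _ children: [SketchNode] = []) {
        self.name = name
        self.children = children
    }

    // The node drives the traversal: notify the visitor, then recurse.
    func accept(_ visitor: SketchVisitor) {
        visitor.beginVisit(self)
        for child in children {
            child.accept(visitor)
        }
    }
}

// A visitor that records element names in document order.
final class NameCollector: SketchVisitor {
    private(set) var names: [String] = []

    func beginVisit(_ element: SketchElement) {
        names.append(element.name)
    }
}

let tree = SketchElement("rss", [
    SketchElement("channel", [
        SketchElement("title", [SketchText("Deeplinks")]),
        SketchElement("item", [
            SketchElement("title", [SketchText("First post")]),
        ]),
    ]),
])

let collector = NameCollector()
tree.accept(collector)
print(collector.names)  // ["rss", "channel", "title", "item", "title"]
```

NameCollector contains no traversal logic at all; the recursion lives entirely in accept(_:), so new operations can be added as new visitor types without touching the node classes.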
Example
The following is taken from MiniDOM.playground in the root of the project. Feel free to open it up and experiment on your own.
Parsing a document
We have an XML document saved in the resources section of the playground. It contains a snapshot of the EFF Updates RSS feed. We’ll begin by parsing the document.
let url = Bundle.main.url(forResource: "eff-updates", withExtension: "rss")!
let parser = Parser(contentsOf: url)
let document = parser?.parse().document
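Walking through the document
The document’s structure is roughly as follows: an <rss> root element contains a single <channel> element, which holds feed-level elements such as <link> along with 50 <item> elements, each with its own <title> and <link> children.
Let’s begin by getting the document element, or root node, of the document.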
let rss = document?.documentElement
rss?.nodeName
Result
"rss"
The <rss> element should have one child: a <channel> element.
let channel = rss?.firstChildElement
channel?.nodeName
Result
"channel"
The <channel> element should have 50 <item> children.
let items = channel?.childElements(withName: "item")
items?.count
Result
50
Each of the <item> elements should have a <title> child.
let itemTitles = items?.compactMap { itemElement -> String? in
    // Items without a <title> child, or without text in it, are skipped.
    let titleElement = itemElement.childElements(withName: "title").first
    return titleElement?.textValue
}
itemTitles
Result
0 "Stupid Patent of the Month: Storing Files in Folders"
1 "NAFTA Renegotiation Will Resurrect Failed TPP Proposals"
2 "New Report Aims to Help Criminal Defense Attorneys Challenge Secretive Government Hacking"
3 "The Most Powerful Single Click in Your Facebook Privacy Settings"
4 "Repealing Broadband Privacy Rules, Congress Sides with the Cable and Telephone Industry"
...
There are <link> elements that are children of the <channel> element, and that are children of each of the <item> elements. We can find all of them.
let linkElementsFromDocument = document?.elements(withTagName: "link")
let linkURLsFromDocument = linkElementsFromDocument?.compactMap { $0.textValue }
linkURLsFromDocument
Path evaluation
The <item> children of the <channel> element should each have a <link> child. Using a path expression, we can collect all of the text children of the <link> elements under the <channel> element.
let linkTextNodesViaPath = document?.evaluate(path: ["rss", "channel", "item", "link", "#text"])
let linkURLsViaPath = linkTextNodesViaPath?.compactMap { $0.nodeValue }
linkURLsViaPath
Visitor
We can collect all of the <title> elements in the document using a visitor.
class TitleCollector: Visitor {
    var titles: [String] = []

    // Called once for every element encountered during the traversal.
    public func beginVisit(_ element: Element) {
        if element.tagName == "title", let title = element.textValue {
            titles.append(title)
        }
    }
}
let titleCollector = TitleCollector()
document?.accept(titleCollector)
titleCollector.titles
Result
0 "Deeplinks"
1 "Stupid Patent of the Month: Storing Files in Folders"
2 "NAFTA Renegotiation Will Resurrect Failed TPP Proposals"
3 "New Report Aims to Help Criminal Defense Attorneys Challenge Secretive Government Hacking"
4 "The Most Powerful Single Click in Your Facebook Privacy Settings"
5 "Repealing Broadband Privacy Rules, Congress Sides with the Cable and Telephone Industry"
Issues and Contributions
Please report any issues you find.
Pull requests are welcome. Please make sure any additions are documented and unit tested. We aim to maintain 100% documentation and test coverage.