Concurrency within the Document Creation Flow
One of TySE’s design goals is to have concurrency built in from the ground up. One focues area of improving performace through concurrency is the data flow starting with the internal document model all the way to a printable document.
The document creation flow is of course central to the TySE application. At its core lives an engine not unlike a web browser: a formatting engine based on HTML and CSS. Such an engine should present many opportunities for concurrent operations and data flows.
Overview
From a high level perspective the flow looks like this:
At some points the concurrent model will manifest in parallelizing small operations (e.g., traversing a style- or layout-tree), at some points a channels-model will be better suited. The latter is especially intuitive for text-flow into regions.
Tree Operations
Layout and styling of elements (in a web-browser as well as in TySE) is riddled with tree operations. The HTML/CSS API is in essence a set of functions on trees. Taking a close look at TySE we’ll indentify at least the following tree structures:
- HTML5 element tree
- CSS object model
- Tree of styled elements (style tree)
- Tree of boxes (layout tree)
- Render tree
TySE handles the traversal and manipulation of some of these trees in a concurrent fashion. This is done by introducing a base tree object that supports concurrent operations transparently.
Example: Let’s say we are looking for all the chapter headings in our document. The tree-method for traversing the styled tree will spawn a number of concurrent workers, which will do the traversal in a workload sharing pattern. As soon as a worker routine finds a chapter, it will put the node into a queue to collect all the results. Then the worker will continue to traverse other branches of the tree looking to match nodes for chapter headings.
The concurrent tree operations form a small DSL. With a predicate function defined like this
func isChapter(node *tree.Node) bool {
// indentify a chapter beginning in some way...
}
the above example would look like this (with an operation performSomething()
defined):
chapters := tree.ChildrenWith(isChapter).Filter(performSomething).Promise()
// now concurrent workers are doing their job
...
resultOfChapterTask := chapters() // collect all the work done
The result of such a call sequence is a “promise”, i.e. a future value. It will be the result of this chain of operations:
- We start at the tree root node.
- Then search (concurrently) for nodes matching
isChapter()
. - For each such node, perform
performSomething()
, which results in other nodes (or in modified versions of the input nodes). - The set of result nodes is a future value. This set will be returned by
calling the promise (
chapters()
in the example above).
Calling the promise is a synchronization point: The call will wait for all running coroutines to finish their work.
The concurrent base tree is used in TySE for the styled tree, the layout tree and the render tree. For an interesting discussion of trees and concurrency within the Firefox styling engine take a look at a brilliant blog entry by Lin Clark.
Channels
Channels are a unique feature of the Go programming language. On the other hand, channels are an easily understandable metaphor for transferring data through an application.
We will take a deeper look at a requirement for typesetting which we will called “region text flow”.