Applying tree traversal algorithms to DOM

We've looked through few binary tree traversal techniques so far:

1- Traversing through the binary tree using recursive and iterative algorithms

2- Traversing through the binary tree using parent pointers

In this article, we'll put those learnings to use for an n-ary tree i.e. DOM. We'll see how we can locate DOM elements using various CSS selectors without using inbuilt APIs like getElementById, getElementsByClassname or querySelector/querySelectorAll. The article would thus also throw light on how these APIs might be working under the hood.

DOM traversal overview

Borrowing the idea from the first article, let's come up with the preOrder traversal algorithm for DOM:

function walkPreOrder(node){
  if(!node) return

  // do something here
  console.log(node)

  for(let child of node.children){
     walkPreOrder(child)
  }
}

We can modify this algorithm to return an iterator instead:

function* walkPreOrder(node){
  if(!node) return

  // do something here
  yield node
  for(let child of node.children){
    yield* walkPreOrder(child)
  }
}

// USAGE
for(let node of walkPreOrder(root)){
  console.log(node)
}

We can use any of the breadth-first or depth-first algorithms (discussed in previous articles) to traverse the DOM. For the sake of this article, we'll stick with the above approach though.

Let's also assume we're working on a document having following HTML:

<html>
  <head>
    <title>DOM selection algorithm</title>
  </head>
<body>

  <div class="container">
    <div class="body">
      <div class="row">
        <img id="profile" src="xyz.jpg" alt="">
      </div>
      <div class="row"></div>
      <div class="row"></div>
    </div>
  </div>

</body>
</html>

Locating a node by ID

Browsers offer document.getElementById() API to achieve this result. Using the walkPreOrder() helper it becomes really simple to achieve this. Let's see:

function locateById(nodeId){
  // iterate through all nodes in depth first (preOrder) fashion
  // return the node as soon as it's found
  for(let node of walkPreOrder(document.body)){
     if(node.id === nodeId){
        return node
     }
  }
   return null
}

We can use the locateById() function as follows:

const img = locateById('profile')
// returns the image node

Locating nodes by className

Browsers offer document.getElementsByClassName() API to achieve this result. Let's see how we can implement something similar:

function locateAllByClassName(className){
   const result = []
   for(let node of walkPreOrder(document.body)){
      if(node.classList.contains(className)){
        result.push(node)
      }
   }
   return result
}

// USAGE
const elements = locateAllByClassName('row')

How browser optimizes the selection queries

Selecting DOM nodes is a fairly common operation for web applications. Traversing through the tree multiple times for the same selector doesn't seem optimal. Browser optimizes the selection by using memoization.

Looking at mozilla parser's source, namely an excerpt from the function startTag:

 // ID uniqueness
 @IdType String id = attributes.getId();
 if (id != null) {
      LocatorImpl oldLoc = idLocations.get(id);
      if (oldLoc != null) {
            err("Duplicate ID \u201C" + id + "\u201D.");
            errorHandler.warning(new SAXParseException(
                  "The first occurrence of ID \u201C" + id
                  + "\u201D was here.", oldLoc));
       } else {
            idLocations.put(id, new LocatorImpl(tokenizer));
       }
 }

We can see those node ids are kept in a simple hash map. We can use a similar approach to ensure repeated queries for the same ID do not require full traversal, instead, we can just look it up from hashMap and return it.

Here's how our solution would look like post memoization:

function getSelectors(){
  const idLocations = {}
  const classLocations = {}

  // updated selector functions  
  function locateById(nodeId){
    if(idLocations.hasOwnProperty(nodeId)) 
       return idLocations[nodeId]

    for(let node of walkPreOrder(document.body)){
       if(node.id === nodeId){
          idLocations[nodeId]= node //memoize
          return node
       }
     }
    idLocations[nodeId]= null // memoize
    return null
  }

  function locateAllByClassName(className){
    if(classLocations.hasOwnProperty(className)) 
         return classLocations[className]

    const result = []
    for(let node of walkPreOrder(document.body)){
       if(node.classList.contains(className)){
          result.push(node)
        }
     }
     classLocations[nodeId]= result
     return result
  }

  return {
       locateById,
       locateAllByClassName
    }

} 

  // USAGE
  const {locateById, locateAllByClassName} = getSelectors();
  const result = locateAllByClassName('row') // returns array of elements
  const img = locateById('profile') // returns an element, if found

Dealing with more complex selectors

Let's try to implement something like element.querySelector. Here's how MDN describes it:

The querySelector() method of the Element interface returns the first element that is a descendant of the element on which it is invoked that matches the specified group of selectors.

Example:

const firstRow = document.querySelector('.container .row:first-child')

In this case we can pass any CSS selector to the function and it should be able to traverse the DOM to find that element for us. Let's see it we how it can be implemented:

// given a selector and root node, find that selector within the root node
function select(selector, root){
  for(let node of walkPreOrder(root)){
      if(node.matches(selector)){
        return node
     }
   }
  return null;
}


function myQuerySelector(path, node){
  // if path is empty, nothing to find
  if(path.length === 0) return null;

  // if node is not provided, let's assume user wants to search within document.body
  let root = node || document.body;  
  const selector = path[0];

  // if there's only one selector in the path, just traverse using select function above
  if(path.length === 1) return select(selector, root);

   // else, either the current node matches the first selector in path or not
   // if first selector matches with current node, look through it's children for subsequent selectors only
   // else, look through it's children for the whole path
  const newPath = root.matches(selector) ? path.slice(1): path;
  for(let child of root.children){
    const ans = myQuerySelector(newPath, child);
    if(ans) return ans
  }

  // nothing found
  return null;
}


// USAGE:
const firstRow = myQuerySelector([".container", ".row"])

Implementation of myQuerySelectorAll (similar to element.querySelectorAll) also follows the same approach with slight modification:

function selectAll(selector, root){
  let result = []
  for(let node of walkPreOrder(root)){
      if(node.matches(selector)){
        result.push(node)
     }
   }
  return result;
}

function myQuerySelectorAll(path, node){
  let result = [];
  if(path.length === 0) return result;

  let root = node || document.body;  
  const selector = path[0];

  if(path.length === 1) return selectAll(selector, root);

  const newPath = root.matches(selector) ? path.slice(1): path;
  for(let child of root.children){
    result = [...result, ...myQuerySelectorAll(newPath, child)]

  }

  return result;
}

Bonus

We can use the recursive preOrder traversal approach, describe at the start of this article, to clone any tree. Let's see how we can use it to clone any DOM tree, similar to what element.cloneNode(true) does:

Create a clone of the source node, by creating a new node with the same tagName and then copying over the attributes.
Recursively call the cloneTree method on all children of the source node, and append the returned nodes as children to the cloned node.

function cloneTree(node){
  if(!node) return

  const clonedNode = document.createElement(node.tagName.toLowerCase())
  const attributes = node.getAttributeNames()

  attributes.forEach(attribute => {
     clonedNode.setAttribute(attribute, node.getAttribute(attribute))
  })

  for(const child of node.children){
      clonedNode.append(cloneTree(child))
  }

  return clonedNode
}

Applying tree traversal algorithms to DOM

DOM traversal overview

Locating a node by ID

Locating nodes by className

How browser optimizes the selection queries

Dealing with more complex selectors

Bonus

Comments (1)

Basics to advanced: Tree data structure in JS

Trie in Javascript: the Data Structure behind Autocomplete

More from this blog

Implementing virtual scroll for web from scratch, in less than 150 lines of code

Graph data structure in Typescript

Heaps in JavaScript

Trie in Javascript: the Data Structure behind Autocomplete

HTTP Refresher: Things You Should Know About HTTP

Command Palette

DOM traversal overview

Locating a node by ID

Locating nodes by className

How browser optimizes the selection queries

Dealing with more complex selectors

Bonus

Comments (1)

Basics to advanced: Tree data structure in JS

Trie in Javascript: the Data Structure behind Autocomplete

More from this blog