Down The Sitecore Query Hole

I believe there is still a use for the good old Query.SelectItems. You can do a lot with the content search API, and no doubt do it a lot faster, but there are cases where you might still decide to use Query.

Anyway, even if only to have some fun exploring, I invite you on a journey down the Sitecore Query Hole. It is not deep so it won’t take too long to get to the bottom of it. We’ll see about the climbing out part.

Trailhead

We will need xUnit and Sitecore.FakeDb for isolation (the assertions further down are written in the FluentAssertions style). Our starting point is a Sitecore database with a home item under /sitecore/content. The home item has two child items:

  public class QueryHole
  {
    private readonly Db db;

    public QueryHole()
    {
      db = new Db
      {
          new DbItem("home") { new DbItem("child1"), new DbItem("child2") }
      };
    }

    // The Holes
  }

Holes

If we look where the light is we won’t find it. The following queries work as you would expect:

    [Theory]
    [InlineData("/sitecore/content/home")]
    [InlineData("/sitecore/content/*")]
    [InlineData("/sitecore/content/*[@@key = 'home']")]
    public void ShouldWorkAsExpected(string query)
    {
      ...
    }

Adding a second, refining predicate, however, throws a ParseException:

    [Theory]
    [InlineData("/sitecore/content//*[contains(@@key, 'child')][1]")]
    public void FoundTheFirstHole(string query)
    {
      using (new DatabaseSwitcher(db.Database))
      {
        Assert.Throws<ParseException>(() => Query.SelectItems(query));
      }
    }

And here’s another one. If you have worked with XPath, you would probably agree that the parent:: axis, and axes in general, behave a little strangely:

    [Theory]
    [InlineData("/sitecore/content/home/child1/parent::*")]
    [InlineData("/sitecore/content/home/child1/parent::*/*")]
    [InlineData("/sitecore/content/home/child1/parent::home/*")]
    [InlineData("/sitecore/content/home/child1/parent::idontexist/child::*")]
    public void FoundTheSecondHole(string query)
    {
      using (new DatabaseSwitcher(db.Database))
      {
        Item[] result = Query.SelectItems(query);

        result.Should().HaveCount(2);
        result.Should().Contain(r => r.Name == "child1");
        result.Should().Contain(r => r.Name == "child2");
      }
    }

And just to make the hole a little deeper, the /parent::* at the end of the first query is not equivalent to /.. (which would actually return the parent item) and is not equivalent to /parent::home (which would return null).

Gear up! We are climbing down.

Opcode

The first step of executing a query is parsing it. Sitecore tokenizes and parses queries into a series of Opcodes. An opcode can be a step (e.g. // translates to a Descendant opcode), can be an operator (e.g. > translates to a GreaterOperator opcode), can be an operand (e.g. literal, number, boolean value), can be a function, etc.

The Predicates

To climb out of the first hole we need to understand how the query parser treats predicates.

A predicate is an Opcode object by birth, a step opcode to be more precise – this is what a predicate is parsed into. A step opcode can have a next step attached to it, so technically a predicate could have another predicate as its next step. This is not how predicates are handled though.

A predicate, logically, is a refinement filter. In a /sitecore/content/*[contains(...)] query the contains function should filter out content’s child items that don’t satisfy the expression in the predicate. When Sitecore parses the predicate it creates a step opcode object, but it doesn’t register it as a step per se. A predicate gets attached to the preceding element step as an attribute, a filter in a way.

I imagine that parsing a predicate as a real step would be more complicated than registering it as a filter – tokenization would have to not only look at slashes but also interpret a [] sequence as a step separator. And that’s why, I believe, in the Sitecore Query world a predicate is an attribute of the element step. And it happens to be a scalar value, not a list, hence only one predicate per step. It would probably not be too hard to recurse the QueryParser.GetPredicate() method and make it return a list of predicates, but that’s not how it’s implemented.
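To make the one-predicate-per-step idea a bit more tangible, here is a toy sketch. It is not Sitecore's code: ToyElementStep and ToyStepParser are names I made up, and the FormatException only stands in for the real ParseException. The point is just that a step holding a single predicate value has nowhere to put a second one.

```csharp
using System;

// Toy model only: these types are hypothetical and are not Sitecore's classes.
public sealed class ToyElementStep
{
    public string Name { get; set; }
    public string Predicate { get; set; }   // a single value, not a list
}

public static class ToyStepParser
{
    // Parses one path step such as "*[contains(@@key, 'child')]".
    public static ToyElementStep Parse(string step)
    {
        int open = step.IndexOf('[');
        if (open < 0)
        {
            return new ToyElementStep { Name = step };
        }

        int close = step.IndexOf(']', open);
        if (close < 0)
        {
            throw new FormatException("Missing ']' in: " + step);
        }

        // Only one predicate is read; anything after its closing bracket is unexpected.
        if (close != step.Length - 1)
        {
            throw new FormatException("Unexpected text after predicate: " + step.Substring(close + 1));
        }

        return new ToyElementStep
        {
            Name = step.Substring(0, open),
            Predicate = step.Substring(open + 1, close - open - 1)
        };
    }

    public static void Main()
    {
        Console.WriteLine(Parse("*[contains(@@key, 'child')]").Predicate);

        try
        {
            Parse("*[contains(@@key, 'child')][1]");
        }
        catch (FormatException e)
        {
            Console.WriteLine(e.Message);
        }
    }
}
```

The first call parses fine; the second fails on the trailing [1], which is roughly the shape of the first hole.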

We’re out of the first hole.

In XPath terms, two contiguous predicates are not the same as two expressions connected with an and. Compare [contains(@@key, '2')][1] and [contains(@@key, '2') and position() = 1]. The first tells the engine to collect all elements with a “2” in their name and pick the first from the resulting list; the second only matches the first element, and only if it also has a “2” in its name.
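The difference is easy to see with plain LINQ over the two child keys. This is only an analogy for the XPath semantics described above, not a Sitecore API; PredicateDemo and its method names are made up for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PredicateDemo
{
    // Models [contains(@@key, text)][1]:
    // filter first, then take the first item of the filtered list.
    public static string[] Chained(IEnumerable<string> keys, string text) =>
        keys.Where(k => k.Contains(text)).Take(1).ToArray();

    // Models [contains(@@key, text) and position() = 1]:
    // both conditions are tested against the original, unfiltered list.
    public static string[] Combined(IEnumerable<string> keys, string text) =>
        keys.Where((k, index) => k.Contains(text) && index + 1 == 1).ToArray();

    public static void Main()
    {
        var keys = new[] { "child1", "child2" };

        Console.WriteLine(string.Join(",", Chained(keys, "2")));   // prints "child2"
        Console.WriteLine(string.Join(",", Combined(keys, "2")));  // prints an empty line
    }
}
```

With child1 and child2, the chained form finds child2, while the combined form finds nothing because the element in position 1 is child1.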

The Axes

In XPath, axes specify direction: they tell the parser where to look for the next element match. MDN defines an axis as “a relationship to the context node … used to locate nodes relative to that node on the tree”. The XPath W3C spec defines a location step – the expression between two / in an XPath query – as a sequence of an axis specifier, a node test, and predicates. In short, parent::home looks up to the parent node (the axis specifier) and only matches if that node is named home (the node test). Sitecore Query’s axes clearly do something different.

The axes, just like predicates, are step opcode objects by birth. QueryParser creates them as it parses the query expression. Let’s start with the parent::*.

parent::*

So how does parent::* end up returning child items? Well, let’s see how parents are created:

    protected Step GetParentAxis()
    {
      ...
      Step step = this.m_builder.Parent();
      step.NextStep = this.m_builder.Children(this.GetElement());
      ...
      return step;
    }

Right there! A parent step receives a Children step as its next step by default, or rather by design. Where did the node test part go, you might ask? It may not be intuitive, but it is in GetElement() and in how the Children step evaluates itself. There’s a caveat though. GetElement() gets the token following the ::, which in the parent::* case is the * – the node test. The Children step then uses this token to perform the node test on … the child items. The * in parent::* in Sitecore Query doesn’t translate to the parent item of any name. It translates to any child item of the parent item. That’s why we get the two child items as a result. And that’s why, by the way, parent::home returns null, a nothing-found result: home has no child named home.

parent::*/*

So how is it possible then that parent::*/* does the same as parent::*? Let’s take a step back. A query is a sequence of path steps. In the /sitecore/content/home/child1/parent::*/* query the child1 element step received a parent::* step as its next step. The next step after that is the /* which, just like you would expect, translates to children of any name. The key question is – what item will the children of any name apply to? Any guesses at this point? It’s clearly the home item, otherwise the test would fail, but why? Here. I will show you.

The parent::* returned a step object with a next Children step attached to it. The parser then went ahead and attached a new next step – the /* part – to … exactly! the step returned from GetParentAxis(). There’s only one NextStep on a step, so the Children step built from the ::* part is overwritten. Garbage collected. Gone. It won’t even be evaluated. That’s why we don’t feel the side effect of parent::* returning child items when running a parent::*/* query.

It should be clear by now what’s going on with the rest of the examples. The parent::home, as I mentioned earlier, tries to find child items named home under child1’s parent and fails. The parent::home/* is a full equivalent of parent::*/* – the part after :: and before / is ignored, as we’ve just learned. And the same happens with idontexist. Ignored. Not evaluated. The child::* that follows applies to the parent step.

/..

The dot-dot is not parsed via the GetParentAxis() method. There’s a GetParent() method that doesn’t add Children as a next step, and that’s why /.. returns the parent item and not the child items like parent::* does.
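If you want to replay the whole axes story without a Sitecore instance, here is a toy model. Everything in it (ToyItem, ToyStep, ToyParent, ToyChildren, ToyParser) is made up for illustration and only mimics the shape described above: a parent axis that comes with a Children next step, a single NextStep slot that gets overwritten, and a bare parent step for the dot-dot. It returns an empty list where the real query returns null.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy model only: none of these types are Sitecore's.
public sealed class ToyItem
{
    public string Name { get; }
    public ToyItem Parent { get; private set; }
    public List<ToyItem> Children { get; } = new List<ToyItem>();

    public ToyItem(string name, params ToyItem[] children)
    {
        Name = name;
        foreach (var child in children) { child.Parent = this; Children.Add(child); }
    }
}

public abstract class ToyStep
{
    public ToyStep NextStep { get; set; }   // one slot only

    protected abstract IEnumerable<ToyItem> Select(ToyItem context);

    // Applies this step, then feeds every result into the next step, if any.
    public List<ToyItem> Evaluate(ToyItem context)
    {
        var selected = Select(context).ToList();
        return NextStep == null
            ? selected
            : selected.SelectMany(item => NextStep.Evaluate(item)).ToList();
    }
}

public sealed class ToyParent : ToyStep
{
    protected override IEnumerable<ToyItem> Select(ToyItem context) =>
        context.Parent == null ? Enumerable.Empty<ToyItem>() : new[] { context.Parent };
}

public sealed class ToyChildren : ToyStep
{
    private readonly string nameTest;
    public ToyChildren(string nameTest) { this.nameTest = nameTest; }

    protected override IEnumerable<ToyItem> Select(ToyItem context) =>
        context.Children.Where(c => nameTest == "*" || c.Name == nameTest);
}

public static class ToyParser
{
    // Same shape as GetParentAxis(): the node test ends up on a Children step.
    public static ToyStep ParentAxis(string nodeTest) =>
        new ToyParent { NextStep = new ToyChildren(nodeTest) };

    // The dot-dot as described above: a parent step with no Children attached.
    public static ToyStep DotDot() => new ToyParent();

    // Attaching the following path step replaces whatever NextStep held before.
    public static ToyStep Then(this ToyStep step, ToyStep next)
    {
        step.NextStep = next;
        return step;
    }
}

public static class ToyProgram
{
    private static void Print(IEnumerable<ToyItem> items) =>
        Console.WriteLine(string.Join(",", items.Select(i => i.Name)));

    public static void Main()
    {
        var child1 = new ToyItem("child1");
        var home = new ToyItem("home", child1, new ToyItem("child2"));

        Print(ToyParser.ParentAxis("*").Evaluate(child1));      // child1,child2
        Print(ToyParser.ParentAxis("home").Evaluate(child1));   // empty line
        Print(ToyParser.ParentAxis("idontexist").Then(new ToyChildren("*")).Evaluate(child1)); // child1,child2
        Print(ToyParser.DotDot().Evaluate(child1));             // home
    }
}
```

The third call is the interesting one: the idontexist node test sits on a Children step that is thrown away the moment the next path step is attached.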

We’re out of the second hole.

The End

The trail down the Sitecore Query Hole ends here. I hope you enjoyed it.

Reference: If you want to learn more about opcodes, predicates, and axes I suggest you explore the Sitecore.Data.Query namespace.
