Better C++ Syntax Highlighting - Part 8: Qualifiers
Throughout this series, we’ve focused on annotating the targets of AST nodes — types, functions, member variables, and so on. However, many of these nodes also contain qualifiers, such as namespaces or class names, that appear on type names, static member references, and function calls.
Consider the following example, taken from a previous post in this series:
With all of our existing visitors enabled, the output looks like this:
While the type names and member variables are properly annotated, notice how the namespace and class qualifiers in expressions like:
std::[[function,sqrt]], using namespace math::[[namespace-name,detail]], math::[[class-name,Vector3]] in type aliases, and math::[[class-name,Vector3]]::[[member-variable,zero]] in static member references
remain partially unannotated.
In the case of math::[[class-name,Vector3]], you’ll notice the qualifying class name is already highlighted.
That’s because our TypeLoc visitor handles annotating the type portion of the expression.
However, namespace qualifiers like math:: or std:: are not being annotated yet.
In this post, we’ll implement a generic way to add annotations to namespace, class, and enum qualifiers on types, functions, and other declarations.
Annotating qualifiers
To retrieve qualifier information, we’ll introduce a new extract_qualifiers helper function.
This function walks up the declaration hierarchy of a given AST node, recording any enclosing namespaces, classes, and enums it encounters along the way.
As before, we’ll access this hierarchy through the DeclContext chain of a given node:
The function achieves this by repeatedly calling getParent(), which takes one step outward in the declaration hierarchy.
This continues until the root translation unit, whose parent is null.
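To make the traversal concrete, here is a minimal, Clang-free sketch of the same outward walk. MockDeclContext, ContextKind, and the Qualifiers alias are hypothetical stand-ins invented for illustration (the real function operates on clang::DeclContext and calls getParent()), so treat this as an outline of the idea rather than the actual implementation.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for clang::DeclContext; not the real Clang API.
enum class ContextKind { TranslationUnit, Namespace, Class, Enum, Function };

struct MockDeclContext {
    ContextKind kind;
    std::string name;
    const MockDeclContext* parent; // nullptr for the root translation unit
};

// (qualifier name, annotation) pairs, innermost qualifier first.
using Qualifiers = std::vector<std::pair<std::string, std::string>>;

Qualifiers extract_qualifiers(const MockDeclContext* context) {
    Qualifiers result;

    // Take one step outward per iteration until there is no parent left.
    for (const MockDeclContext* current = context; current != nullptr; current = current->parent) {
        switch (current->kind) {
            case ContextKind::Namespace:
                result.emplace_back(current->name, "namespace-name");
                break;
            case ContextKind::Class:
                result.emplace_back(current->name, "class-name");
                break;
            case ContextKind::Enum:
                result.emplace_back(current->name, "enum-name");
                break;
            default:
                // The translation unit, functions, etc. contribute no qualifier.
                break;
        }
    }

    return result;
}
```

Starting from a context chain that models math::Vector3, this yields Vector3 (class-name) followed by math (namespace-name).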
Qualifiers get collected into a QualifierList, a lightweight utility class for tracking both the name of each qualifier and the annotation used for it.
The QualifierList also provides simple lookup functions for determining if a given token is a known qualifier, and retrieving the annotation.
Its implementation is straightforward, as it just registers the proper annotation for each qualifier encountered:
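As a rough illustration, a QualifierList along these lines could look like the following. Only contains() and get_qualifier_type() are named in this post; the add_namespace(), add_class(), and add_enum() functions and the map-based storage are assumptions made for the sketch.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

class QualifierList {
    public:
        // Hypothetical registration functions, one per qualifier kind.
        void add_namespace(const std::string& name) { m_qualifiers[name] = "namespace-name"; }
        void add_class(const std::string& name) { m_qualifiers[name] = "class-name"; }
        void add_enum(const std::string& name) { m_qualifiers[name] = "enum-name"; }

        // Returns true if the token spelling matches a registered qualifier.
        bool contains(const std::string& token) const {
            return m_qualifiers.find(token) != m_qualifiers.end();
        }

        // Returns the annotation for a registered qualifier.
        // Precondition: contains(token) is true.
        const std::string& get_qualifier_type(const std::string& token) const {
            auto it = m_qualifiers.find(token);
            assert(it != m_qualifiers.end());
            return it->second;
        }

    private:
        // Maps a qualifier name to its annotation.
        std::unordered_map<std::string, std::string> m_qualifiers;
};
```

Keying the map on the qualifier name keeps both lookups to a single hash probe, at the cost of not distinguishing two different qualifiers that happen to share a spelling.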
Once we have the DeclContext of a node, we can easily retrieve its qualifiers and apply annotations using the same tokenization pattern we’ve used in earlier posts.
To simplify its usage, we’ll wrap this logic in a visit_qualifiers() helper function:
We check each token to see if it matches a known qualifier using the contains() function.
If it does, we use get_qualifier_type() to determine the correct annotation - namespace-name for namespaces, class-name for classes, or enum-name for enums.
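The token loop can be sketched without Clang as follows. Token, Annotation, and the KnownQualifiers map are hypothetical stand-ins for the tokenizer output, the annotator, and the QualifierList respectively; a single find() call here plays the role of contains() followed by get_qualifier_type().

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical token and annotation records, for illustration only.
struct Token {
    std::string spelling;
    unsigned line;
    unsigned column;
};

struct Annotation {
    unsigned line;
    unsigned column;
    std::size_t length;
    std::string name;
};

// Stand-in for QualifierList: qualifier name -> annotation.
using KnownQualifiers = std::unordered_map<std::string, std::string>;

// Annotates every token in the range that matches a known qualifier.
std::vector<Annotation> visit_qualifiers(const KnownQualifiers& qualifiers, const std::vector<Token>& tokens) {
    std::vector<Annotation> annotations;

    for (const Token& token : tokens) {
        assert(!token.spelling.empty()); // tokens are expected to have a spelling

        auto it = qualifiers.find(token.spelling);
        if (it == qualifiers.end()) {
            continue; // not a qualifier (e.g. '::' or the referenced name itself)
        }

        annotations.push_back({ token.line, token.column, token.spelling.size(), it->second });
    }

    return annotations;
}
```

For the tokens of math::Vector3::zero, with math and Vector3 registered as qualifiers, this produces two annotations and leaves both :: separators and zero untouched.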
All that’s left is to add calls to this function in visitors for nodes that may contain qualifiers.
Adding qualifiers
Enums
When an enum is declared as an enum class, references to its constants must include the enum name as a qualifier:
To handle this, we’ll update the VisitDeclRefExpr visitor to annotate qualifiers in addition to the referenced enum constant:
Note that visit_qualifiers() is called unconditionally, as VisitDeclRefExpr handles more than just enum constants.
This way, other nodes (such as references to static class member variables) also get their qualifiers annotated.
Namespaces
Namespace aliases and using namespace directives are another common place where qualifiers appear:
The existing VisitNamespaceAliasDecl and VisitUsingDirectiveDecl visitors only annotated the final utility namespace in the chain and left qualifiers like math unannotated.
Adding a visit_qualifiers() call to both visitors ensures that nested namespace qualifiers are annotated correctly.
For namespace aliases:
Here, we annotate using the DeclContext of the namespace being aliased to properly capture the entire namespace chain.
For using namespace directives, we annotate any qualifiers on the nominated namespace:
With both of these visitors updated, qualifiers in namespace directives are now properly handled:
Functions
To annotate qualifiers on both function declarations and calls, we need to update a few existing visitor implementations. Consider the following example:
For annotating qualifiers on function declarations, such as out-of-line class member functions, we’ll update the VisitFunctionDecl visitor:
Notice the source range:
This range covers all tokens from the start of the declaration (including qualifiers like Vector3::) up to the function name (which we don’t care about, since we’ve already annotated it before).
This ensures we annotate all relevant qualifiers consistently, even for constructors, destructors, and overloaded operators.
Using getTypeSpecStartLoc() as the start of the range is insufficient in those cases, as constructors and destructors lack a return type.
This approach also properly annotates definitions and forward declarations of namespace-qualified global functions, such as:
For explicitly-qualified operator calls, such as a.Vector3::operator+(b), we’ll annotate qualifiers in VisitCXXOperatorCallExpr:
The range:
ensures that only the relevant tokens before the operator symbol are parsed for qualifiers (e.g. Vector3::operator).
All other function calls, including qualified regular function calls, are handled by VisitCallExpr:
With these visitors updated, qualifiers for function declarations and calls are properly annotated:
Classes
Most qualifier annotations pertaining to classes are already handled by the updates made earlier, but static class member definitions still require attention. Consider the following example:
Static class member definitions are handled by the VisitVarDecl visitor:
Unfortunately, there is no direct way to retrieve the source range of just the qualified member name.
The best candidate, node->getTypeSpecStartLoc(), points to the start of the type, not the member itself:
To annotate only the qualifiers on the member name, we’ll offset the range by the length of the fully-qualified typename, obtained via node->getType().getUnqualifiedType().getAsString().
The getUnqualifiedType() call strips top-level cv-qualifiers like const and volatile from the type. (It does not strip references or pointers, which remain part of the printed type name.)
With this approach, we can annotate just the qualifiers on the member:
Annotations for references to the static member itself are already handled by the qualifier logic added to VisitDeclRefExpr earlier.
Similarly, we’ll update the VisitCXXTemporaryObjectExpr visitor to annotate qualifiers for temporary object constructors themselves (such as the initialization of static class member Vector3::zero to math::Vector3()).
This is a simple addition to the visitor:
This change ensures that annotations are consistent for any namespace or class qualifiers that appear before a temporary constructor call. Now, all constructor calls (whether for temporary objects or otherwise) have uniform qualifier annotations.
Finally, we’ll extend VisitMemberExpr to account for explicit qualifiers on class member variables, such as accessing a member from a base class:
We’ll modify VisitMemberExpr to annotate the qualifier when present:
The hasQualifier() check ensures we only process nodes that explicitly include a qualifier.
This avoids unnecessary work for standard member accesses without qualifiers.
With the visitor updated, qualifiers on member accesses are properly annotated:
Types
TypeLoc nodes represent type references in variable declarations, function parameters and return values, template arguments, and more.
With our current visitor setup, we can add qualifier annotations for these types with only a few small tweaks.
Take the RecordTypeLoc case as an example:
We update the implementation to also extract the declaration context in addition to the type name, giving us everything we need to annotate the type and its qualifiers.
The challenge is that for most TypeLoc nodes, the location provided by getBeginLoc() refers only to the type name itself, and not any preceding namespace or class qualifiers.
Manually querying the parent AST node for ranges is possible, but inefficient.
For example, the parent AST node of a function parameter is a FunctionDecl node, but retrieving its range means tokenizing the entire function signature and body.
While this is a viable strategy, it is completely overkill.
Instead, we can tokenize backwards from the type name, annotating qualifiers while walking to the beginning of the qualified type expression.
This avoids any unnecessary work and isolates exactly what we need.
We can use the iterator functions from the Tokenizer to implement this incremental traversal logic in a new visit_qualifiers helper:
While the current token is a known qualifier or a :: separator, we know we are still annotating part of the qualified type.
As soon as this condition is no longer true, we know we’ve reached the end.
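The backwards walk can be sketched in isolation like this. As before, Token, Annotation, and KnownQualifiers are hypothetical stand-ins, and a plain index into a token vector takes the place of the Tokenizer iterators; the loop condition is the one described above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical token and annotation records, for illustration only.
struct Token {
    std::string spelling;
    unsigned line;
    unsigned column;
};

struct Annotation {
    unsigned line;
    unsigned column;
    std::size_t length;
    std::string name;
};

// Stand-in for QualifierList: qualifier name -> annotation.
using KnownQualifiers = std::unordered_map<std::string, std::string>;

// Walks backwards from the type name at 'type_index', annotating qualifiers
// until a token that is neither a known qualifier nor '::' is reached.
std::vector<Annotation> visit_qualifiers(const KnownQualifiers& qualifiers, const std::vector<Token>& tokens, std::size_t type_index) {
    assert(type_index < tokens.size());
    std::vector<Annotation> annotations;

    for (std::size_t i = type_index; i > 0; --i) {
        const Token& previous = tokens[i - 1];

        if (previous.spelling == "::") {
            continue; // separator: still inside the qualified type
        }

        auto it = qualifiers.find(previous.spelling);
        if (it == qualifiers.end()) {
            break; // reached the start of the qualified type expression
        }

        annotations.push_back({ previous.line, previous.column, previous.spelling.size(), it->second });
    }

    return annotations;
}
```

For a parameter such as const math::detail::Vector3& v, starting at the Vector3 token annotates detail and then math, and stops at const without ever touching the rest of the function signature.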
Annotating qualifiers on TypeLoc nodes is now trivial:
Concepts
Certain nodes in concept contexts are not processed by the VisitTypeLoc visitor, despite appearing and functioning as types.
For example:
Qualifiers are not annotated on type constraints in trailing requirements (such as std::same_as), on concept specializations (such as detail::Iterable<T>), or on constraints applied to template parameters.
To support this, we’ll extend the VisitConceptSpecializationExpr, VisitRequiresExpr, and VisitTemplateTypeParmDecl visitors to annotate qualifiers explicitly.
The VisitConceptSpecializationExpr visitor already handles annotating the concept name itself.
We’ll extend this function to also annotate any namespace or class qualifiers as well:
The VisitRequiresExpr visitor is responsible for annotating trailing type constraints within requires expressions, such as:
Similarly, we’ll update this visitor to also annotate qualifiers in addition to the concept name:
This uses the visit_qualifiers() overload we implemented in the previous section.
Alternatively, a range-based approach works just as well:
This annotates tokens from the end of the requires expression to the type name of the constraint:
The standard guarantees that only one type constraint may appear in a trailing requirement, so this approach is equally viable.
For constraints on template parameters, we’ll update VisitTemplateTypeParmDecl:
Clang doesn’t expose the full source range for the constraint, only the location of the type name itself. However, since type constraints on template parameters are structurally identical to types, we can employ the reverse-tokenization approach for this case as well.
With these visitors updated, qualifiers in concept contexts are now properly annotated:
Due to their dependence on an ambiguous type T, it is not currently possible to annotate qualifiers on UnresolvedLookupExpr expressions like std::begin() or std::end().
For now, such cases will require manual annotation.
In this post, we implemented a generic way to annotate qualifiers across a variety of different contexts.
Note that this was not intended to provide exhaustive handling for all AST nodes that may contain qualifiers.
Many of these cases were discovered incrementally while working through real-world examples, so it’s likely that additional nodes or edge cases exist that aren’t covered here.
Fortunately, the visit_qualifiers() functions are designed to be reusable and easy to integrate into new visitors as new and unhandled cases arise.
In the next post, Better C++ Syntax Highlighting - Part 9: Preprocessor, we’ll hook into the Clang preprocessor to annotate preprocessor directives such as file includes, macro definitions, and conditional compilation directives. Thanks for reading!