clang-tags – User Manual

clang-tags is a C / C++ source code indexing tool. Unlike many other indexing tools, clang-tags relies on the clang compiler (via the libclang interface) to analyse and index the source code base.

1 Introduction

Let us consider the following C++ code:

 1: #include "config.h"
 2: #include <iostream>
 4: #ifdef DEBUG
 5: inline void debug (const char* message) {                //
 6:   std::cerr << message << std::endl;
 7: }
 8: #else
 9: inline void debug (const char* message) {}               //
10: #endif
12: template <typename T> struct MyClass {
13:   MyClass () { debug ("MyClass::MyClass"); }             //
15:   void display () {                                      //
16:     std::cout << "MyClass<T>::display()" << std::endl;
17:   }
18: };
20: template <> struct MyClass<int> {
21:   void display () {                                      //
22:     std::cout << "MyClass<int>::display()" << std::endl;
23:   }
24: };
26: int main () {
27:   int display = 3;                                       //
29:   MyClass<double> a;                                     //
30:   a.display();                                           //
32:   MyClass<int> b;
33:   b.display();                                           //
34: }

This file presents a few difficulties that make indexing hard without the help of a full-fledged C++ compiler:

  1. Depending on the DEBUG preprocessor macro, the definition of the function debug might be found at line 5 or 9.
  2. The display identifier is used in several places to mean widely different things (for example at lines 27 and 30). Context information is necessary to disambiguate uses of the same identifier.
  3. Even when display refers to the method of template class MyClass<>, a precise knowledge of the complex C++ template specialization rules is required to associate a.display() with MyClass<T>::display() and b.display() with MyClass<int>::display().
  4. Some functions such as constructors or operators can be called without explicitly appearing in the code. For example, the MyClass::MyClass() constructor is called at line 29.

On the other hand, using a C++ compiler such as clang to parse this code and generate an AST requires information about the project configuration and its build environment:

  1. The config.h file could be automatically generated by autoconf or CMake and reside in a completely different directory than other source files. The compiler will typically need to be provided with -I command-line arguments to know where to find header files.
  2. The DEBUG macro could well be defined by the compilation command line (typically using a -D switch).

A compiler-based indexing tool thus needs to be aware of the full compilation commands which are used to build the project, including for each source file:

  • the full list of command-line switches,
  • the directory from which the file is compiled.

Such information, which we will refer to as a "compilation database" in the rest of this manual, is usually found in the build system configuration, which can take a wide variety of forms: hand-crafted Makefile, autotools project, CMake project, etc.
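
As an illustration of what such a database has to capture, the following Python sketch pulls the -I and -D switches out of a single compilation command. The function name include_dirs_and_macros and the sample command are hypothetical, chosen only for this example; this is not how clang-tags itself processes commands.

```python
import shlex


def include_dirs_and_macros(command):
    """Return (include directories, macro definitions) found in a compile command."""
    includes, macros = [], []
    args = iter(shlex.split(command)[1:])  # skip the compiler name
    for arg in args:
        for flag, bucket in (("-I", includes), ("-D", macros)):
            if arg == flag:
                # Separated form: "-I dir" or "-D NAME"
                bucket.append(next(args, ""))
                break
            if arg.startswith(flag):
                # Attached form: "-Idir" or "-DNAME"
                bucket.append(arg[len(flag):])
                break
    return includes, macros


# Hypothetical command line for the example file
sample = "g++ -I. -I ../include -DDEBUG -c -o main.o ../src/main.cxx"
print(include_dirs_and_macros(sample))
```

Without the -I and -D switches, a parser could neither locate config.h nor tell which definition of debug is active.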

1.1 Features

clang-tags aims at providing the following features:

  1. generate a compilation database in a build-system-agnostic way,
  2. index the sources of the project,
  3. use this index to provide access to IDE-like features from a command-line interface:
    • search for symbol definitions in the source code,
    • search for symbol uses in the source code,
    • auto-complete partially entered code,
  4. integrate these features into Emacs.

1.2 First steps for the impatient

All details are explained later, but the quick start guide provides a few steps to get you started if you are too impatient to read.

1.3 Important notions – Terminology

The following terms are used throughout this guide:

Translation Unit
A body of source code that is compiled together. Ordinarily, a translation unit consists of a preprocessed source file, in which all header files have been included, macros have been expanded, and so on. In clang-tags, a translation unit is identified by the set of command-line arguments that would be needed to compile it into an object file.
Definition
The place in the source code where a symbol is declared and/or defined. For each translation unit, a given symbol only has one definition location. However, different definition locations can be found for the same symbol across all translation units. For example, the definition of local variable a in function main of main.cxx appears at line 29.
Reference
Each occurrence of a symbol name in the source code is seen as a reference to its definition. For example, the symbol a in expression a.display() on line 30 is a reference to the definition at line 29.
Spelling
In clang terminology, the name of a symbol as it appears in the source code is referred to as its spelling. For example, the spelling of the symbol defined at line 29 is the string "a".
Unified Symbol Resolution (USR)
A symbol cannot be identified by its spelling alone: context information is needed to disambiguate uses of the same spelling in different scopes. In order to uniquely identify a symbol across all translation units in a project, clang defines Unified Symbol Resolutions. For example, the USR of the display symbol referred to on line 30 is c:@S@MyClass>#d@F@display#, whereas line 33 refers to the symbol with USR c:@S@MyClass>#I@F@display#.

2 Command-line interface

2.1 Creating the compilation database

clang-tags uses a JSON compilation database to get the information needed to correctly build the project: compile directories and command-line switches. There are different ways to collect this information.
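
For reference, a JSON compilation database is a list of entries, one per translation unit, each recording the working directory, the full command and the source file. The Python sketch below builds and prints one such entry. The directory and command shown are made-up values for the example file, not output produced by clang-tags.

```python
import json

# One entry per translation unit. The path and flags below are illustrative only.
entry = {
    "directory": "/home/user/project/build",  # directory the file is compiled from
    "command": "g++ -I. -DDEBUG -c -o main.o ../src/main.cxx",  # full command line
    "file": "../src/main.cxx",  # source file of the translation unit
}

database = [entry]
print(json.dumps(database, indent=2))
```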

2.1.1 From a CMake project

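CMake (since version 2.8.5) supports the generation of a compilation database with the option CMAKE_EXPORT_COMPILE_COMMANDS. For a CMake-managed project, creating the compilation database is thus as simple as enabling this option when configuring the build, for example by passing -DCMAKE_EXPORT_COMPILE_COMMANDS=ON to cmake. CMake then writes a compile_commands.json file in the build directory.
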
2.1.2 Tracing the standard build process

usage: clang-tags trace [-h] ...

Create a compilation database by tracing a build command.

positional arguments:
  COMMAND     build command line

optional arguments:
  -h, --help  show this help message and exit

For projects not managed by CMake, there is no "free" way to build the compilation database. One way to get the necessary information is to inspect the build process as a black box using strace(1) (see also Bear, a tool that uses LD_PRELOAD to implement the same kind of strategy).

Such a method is inherently independent of the build system: Makefile (possibly autotools-generated), shell or Python script, and so on. The downside is that make and other build tools traditionally rebuild only what is needed, so the generated compilation database can be incomplete unless the project is rebuilt from scratch. Such methods also depend on platform-specific features to inspect the build process.

Example usage:

make clean
clang-tags trace make
rm -f main.o
g++ -c -o main.o ../src/main.cxx
g++ -o main main.o

2.1.3 Scanning the sources directory

usage: clang-tags scan [-h] [--compiler COMPILER] srcdir ...

Create a compilation database by scanning a source directory

positional arguments:
  srcdir                top sources directory
  CLANG_ARGS            additional clang command-line arguments

optional arguments:
  -h, --help            show this help message and exit
  --compiler COMPILER, -c COMPILER
                        compiler name (default: gcc)

For relatively simple projects, it can be sufficient to simply scan the top sources directory to find all *.c or *.cxx files, and additionally provide clang-tags with a set of command-line arguments necessary for clang to parse these files.

Example usage:

clang-tags scan ../src -- -I.
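
The scanning strategy described above can be sketched in a few lines of Python. This is only an approximation under stated assumptions, not the actual clang-tags implementation: the function name scan_sources is hypothetical, and the exact command line that clang-tags records for each file is not documented here.

```python
import os
import shlex


def scan_sources(srcdir, compiler="gcc", clang_args=()):
    """Build compilation-database entries for every *.c / *.cxx file under srcdir."""
    entries = []
    for root, _dirs, files in os.walk(srcdir):
        for name in sorted(files):
            if name.endswith((".c", ".cxx")):
                path = os.path.abspath(os.path.join(root, name))
                entries.append({
                    # Assumption: every file is compiled from the current directory
                    "directory": os.getcwd(),
                    "command": shlex.join([compiler, *clang_args, "-c", path]),
                    "file": path,
                })
    return entries
```

Calling scan_sources("../src", clang_args=["-I."]) would roughly mirror the example invocation above. Because every file receives the same arguments, this approach only suits projects whose sources all build with identical switches.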

2.2 Indexing the source files

2.2.1 Creating the index

usage: clang-tags index [-h] [--exclude DIR] [--exclude-clear]

Create an index of all tags in the source code base. Source files and compilation commands
are taken from a clang "compilation database" in JSON format, previously read using the
"load" subcommand.

optional arguments:
  -h, --help            show this help message and exit
  --exclude DIR, -e DIR
                        do not index files under DIR
  --exclude-clear, -E   reset exclude list

This command uses the compilation database to index all source files.

Example usage:

clang-tags index
Server response:

-- Indexing project
  parsing...    0.163754s.
  indexing...   0.024179s.
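
The --exclude option skips every file located under a given directory. A minimal Python sketch of such a test is shown below. The helper name is_excluded is hypothetical, POSIX-style paths are assumed, and clang-tags may well implement the check differently.

```python
import os


def is_excluded(path, excluded_dirs):
    """Return True if path lies under one of the excluded directories."""
    path = os.path.abspath(path)
    for directory in excluded_dirs:
        directory = os.path.abspath(directory)
        # Compare whole path components, so "/usrlocal" is not treated as under "/usr"
        if os.path.commonpath([path, directory]) == directory:
            return True
    return False


print(is_excluded("/usr/include/stdio.h", ["/usr"]))
```

Comparing path components rather than raw string prefixes avoids excluding sibling directories that merely share a prefix.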

2.2.2 Updating the index

usage: clang-tags update [-h]

Update the source code base index, using the same arguments as previous call to `index'

optional arguments:
  -h, --help  show this help message and exit

This command updates the index.

2.3 Looking for symbols

2.3.1 Finding the definition of a symbol

usage: clang-tags find-def [-h] [--index] [--recompile] FILE_NAME OFFSET

Find the definition location of an identifier in a source file.

positional arguments:
  FILE_NAME        source file name
  OFFSET           offset in bytes

optional arguments:
  -h, --help       show this help message and exit
  --index, -i      look for the definition in the index
  --recompile, -r  recompile the file to find the definition
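
Since OFFSET is expressed in bytes, a line and column position usually has to be converted first. The Python sketch below shows one way to do so. It assumes a 0-based offset counted from the start of the file and 1-based line and column numbers, neither of which the help text specifies; byte_offset is a hypothetical helper, not part of clang-tags.

```python
def byte_offset(data, line, column):
    """Return the byte offset of a 1-based (line, column) position in data (bytes).

    The column is counted in bytes, not in characters.
    """
    lines = data.split(b"\n")
    if not 1 <= line <= len(lines):
        raise ValueError("line out of range")
    if not 1 <= column <= len(lines[line - 1]) + 1:
        raise ValueError("column out of range")
    # Bytes in all previous lines (plus their newlines), then the column within the line
    return sum(len(text) + 1 for text in lines[:line - 1]) + column - 1


source = b"int main () {\n  return 0;\n}\n"
offset = byte_offset(source, 2, 3)
print(offset, source[offset:offset + 6])
```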

Example usage:

clang-tags find-def -i ../src/main.cxx 942
Server response:
-- display -- MemberRefExpr display
   ../src/main.cxx:21-23:8-3: display

-- b.display() -- CallExpr display
   ../src/main.cxx:21-23:8-3: display

-- main () { int display = 3; //(ref:displ... -- FunctionDecl main
   ../src/main.cxx:26-34:5-1: main
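
Judging from the example response above, each location line appears to follow the pattern FILE:BEGIN_LINE-END_LINE:BEGIN_COLUMN-END_COLUMN: SPELLING. The Python sketch below parses such lines under that assumption. The format is inferred from this single example rather than from a specification, and the names Location and parse_location are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class Location(NamedTuple):
    file: str
    begin_line: int
    end_line: int
    begin_column: int
    end_column: int
    spelling: str


# Assumed layout: file:line1-line2:col1-col2: spelling
_LOCATION = re.compile(r"^\s*(.+?):(\d+)-(\d+):(\d+)-(\d+): (.*)$")


def parse_location(line: str) -> Optional[Location]:
    """Parse one location line, or return None for any other kind of line."""
    match = _LOCATION.match(line)
    if match is None:
        return None
    file, l1, l2, c1, c2, spelling = match.groups()
    return Location(file, int(l1), int(l2), int(c1), int(c2), spelling)


print(parse_location("   ../src/main.cxx:21-23:8-3: display"))
```

Lines that describe the AST node, such as those starting with "--", do not match the pattern and yield None, so a caller can filter them out.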

2.3.2 Looking for all references to a symbol

usage: clang-tags grep [-h] USR

Find all uses of a definition, identified by its USR (Unified Symbol Resolution). Outputs
results in a grep-like format.

positional arguments:
  USR         USR for the definition

optional arguments:
  -h, --help  show this help message and exit

Example usage:

clang-tags grep 'c:@S@MyClass>#I@F@display#'
Server response:
../src/main.cxx:21:  void display () {                                      // (defDisplayInt)
../src/main.cxx:33:  b.display();                                           // (display3)
../src/main.cxx:33:  b.display();                                           // (display3)

3 Emacs user interface

First, load the package using M-x load-file RET path/to/clang-tags.el RET

With the configuration file generated by the clang-tags index command, all C/C++ source files in the indexed source directory should automatically activate clang-tags-mode and have the ct/default-directory variable point to the index directory.

3.1 Find the definition of the symbol at point

While in a source buffer, you can use clang-tags to find the location of the definition of the symbol under point by pressing M-<dot>.

The list of relevant definitions is presented in a buffer, where pressing RET will take you to the location of the definition.

3.2 Find all uses of a definition in the source base

After having looked for a definition of the symbol under point, and while in the definitions list buffer, press M-<comma> to list all uses of the current definition in the source code base.

Results are presented in a grep-mode buffer.

4 Contributing

Please do!

If you make improvements to this code or have suggestions, please do not hesitate to fork the repository or submit bug reports on GitHub.

Doxygen documentation targeted at developers is also available.
