doxygen
Doxygen Internals


Introduction

This page provides a high-level overview of the internals of doxygen, with links to the relevant parts of the code. This document is intended for developers who want to work on doxygen. Users of doxygen are referred to the User Manual.

The starting point of the application is the main() function.

Configuration options

Configuration file data is stored in the singleton class Config and can be accessed using the wrapper macros Config_getString(), Config_getInt(), Config_getList(), Config_getEnum(), and Config_getBool(), depending on the type of the option.
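The access pattern can be illustrated with a minimal sketch. It assumes a toy singleton (ToyConfig) with one getter per option and Toy_get*() macros; all of these names, and the default values, are hypothetical stand-ins and not doxygen's actual implementation.

```cpp
#include <cassert>
#include <string>

// Toy stand-in for the singleton that holds option values.
// This is NOT doxygen's real Config class; all names here are hypothetical.
class ToyConfig {
public:
    // Meyers singleton: the instance is created on first use.
    static ToyConfig &instance() {
        static ToyConfig theInstance;
        return theInstance;
    }

    // One typed getter per option, named after the option itself.
    bool EXTRACT_ALL() const { return m_extractAll; }
    int NUM_PROC_THREADS() const { return m_numProcThreads; }
    const std::string &OUTPUT_DIRECTORY() const { return m_outputDirectory; }

    // Setter used by this sketch in place of a real configuration file parser.
    void setNumProcThreads(int n) {
        assert(n >= 0);  // a negative thread count makes no sense here
        m_numProcThreads = n;
    }

private:
    ToyConfig() = default;
    bool m_extractAll = false;              // sketch default
    int m_numProcThreads = 1;               // sketch default
    std::string m_outputDirectory = "out";  // sketch default
};

// Wrapper macros: the argument selects the getter, so a misspelled
// option name fails to compile instead of failing at run time.
#define Toy_getBool(name)   (ToyConfig::instance().name())
#define Toy_getInt(name)    (ToyConfig::instance().name())
#define Toy_getString(name) (ToyConfig::instance().name())
```

With this sketch, calling code reads Toy_getInt(NUM_PROC_THREADS) rather than looking an option up by string, which is the main appeal of the macro style.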

The format of the configuration file (options and types) is defined by the file config.xml. As part of the build process, the Python script configgen.py creates the files configoptions.cpp, configvalues.h and configvalues.cpp from this file; these serve as the input for the configuration file parser that is invoked using Config::parse(). The script configgen.py also creates the documentation for the configuration items in the file config.doc.

Gathering Input files

After the configuration is known, the input files are searched using searchInputFiles() and any tag files are read using readTagFile().

Parsing Input files

The function parseFilesSingleThreading() takes care of parsing all files; if NUM_PROC_THREADS!=1, the function parseFilesMultiThreading() is used instead.

These functions use the ParserManager singleton factory to create a suitable parser object for each file. Each parser implements two abstract interfaces: OutlineParserInterface and CodeParserInterface. The OutlineParserInterface is used to collect information about the symbols that can be documented, but does not look into the body of functions. The CodeParserInterface is used for syntax highlighting, but also to collect the symbol references needed for cross-reference relations.
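The factory idea can be sketched as follows. This is a simplified illustration under stated assumptions: ToyParserManager, ToyOutlineParser and the two toy parsers are hypothetical names, only the outline side is modelled, and the selection by file extension is an assumption of the sketch rather than a description of how ParserManager works.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical, reduced outline interface (the real one has more methods).
struct ToyOutlineParser {
    virtual ~ToyOutlineParser() = default;
    virtual bool needsPreprocessing() const = 0;
    virtual std::string language() const = 0;
};

struct ToyCParser : ToyOutlineParser {
    bool needsPreprocessing() const override { return true; }
    std::string language() const override { return "C"; }
};

struct ToyPythonParser : ToyOutlineParser {
    bool needsPreprocessing() const override { return false; }
    std::string language() const override { return "Python"; }
};

// Maps a file extension to a factory that creates a fresh parser object.
class ToyParserManager {
public:
    using Factory = std::function<std::unique_ptr<ToyOutlineParser>()>;

    void registerParser(const std::string &extension, Factory factory) {
        assert(factory != nullptr);  // an empty factory could never create a parser
        m_factories[extension] = std::move(factory);
    }

    // Returns a new parser for the file, or nullptr if the extension is unknown.
    std::unique_ptr<ToyOutlineParser> createParser(const std::string &fileName) const {
        const auto dot = fileName.rfind('.');
        if (dot == std::string::npos) return nullptr;
        const auto it = m_factories.find(fileName.substr(dot));
        if (it == m_factories.end()) return nullptr;
        return it->second();
    }

private:
    std::map<std::string, Factory> m_factories;
};

// Builds a manager that knows the two toy parsers above.
ToyParserManager makeDefaultToyManager() {
    ToyParserManager manager;
    manager.registerParser(".c", [] {
        return std::unique_ptr<ToyOutlineParser>(new ToyCParser);
    });
    manager.registerParser(".py", [] {
        return std::unique_ptr<ToyOutlineParser>(new ToyPythonParser);
    });
    return manager;
}
```

Creating a new parser object per file, instead of sharing one, keeps parser state local to that file, which matters once several files are parsed at the same time.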

If the parser indicates it needs preprocessing via OutlineParserInterface::needsPreprocessing(), doxygen will call Preprocessor::processFile() on the file.

A second step is to convert multiline C++-style comments into C-style comments for easier processing later on. As a side effect of this step, aliases (ALIASES option) are also resolved. The function that performs these two tasks is called convertCppComments().

Note: Alias resolution would be better done in a separate step, as it is now coupled to C/C++ code and does not work automatically for other languages!
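To make the comment conversion concrete, here is a minimal sketch of the idea. It is not the convertCppComments() implementation: the function name toyConvertCppComments() is hypothetical, only lines that start with /// are handled, string literals and alias resolution are ignored, and the output always ends each line with a newline.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Splits text into lines (without the trailing '\n').
static std::vector<std::string> splitLines(const std::string &text) {
    std::vector<std::string> lines;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) lines.push_back(line);
    return lines;
}

// Returns the column of a leading "///" (after optional whitespace),
// or npos if the line does not start with one.
static std::string::size_type docCommentStart(const std::string &line) {
    const auto pos = line.find_first_not_of(" \t");
    if (pos != std::string::npos && line.compare(pos, 3, "///") == 0) return pos;
    return std::string::npos;
}

// Rewrites each run of consecutive "///" lines as one C-style block.
// The number of lines is preserved, so line numbers stay valid.
std::string toyConvertCppComments(const std::string &source) {
    const std::vector<std::string> lines = splitLines(source);
    std::string out;
    for (std::size_t i = 0; i < lines.size(); ++i) {
        const auto col = docCommentStart(lines[i]);
        if (col == std::string::npos) {  // ordinary line: copy unchanged
            out += lines[i] + "\n";
            continue;
        }
        // A run starts/ends where the neighbouring line is not a "///" line.
        const bool first = (i == 0) ||
            docCommentStart(lines[i - 1]) == std::string::npos;
        const bool last = (i + 1 == lines.size()) ||
            docCommentStart(lines[i + 1]) == std::string::npos;
        const std::string indent = lines[i].substr(0, col);
        const std::string rest = lines[i].substr(col + 3);  // text after "///"
        out += indent + (first ? "/**" : " *") + rest + (last ? " */" : "") + "\n";
    }
    return out;
}
```

Keeping the line count unchanged is a deliberate choice in this sketch: later stages can then report warnings against the original line numbers.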

The third step is the actual language parsing and is done by calling OutlineParserInterface::parseInput() on the parser interface returned by the ParserManager.

The result of parsing is a tree of Entry objects. Each Entry object roughly contains the raw data for a symbol and is later converted into a Definition object.

When a parser finds a special comment block in the input, it performs a first parsing pass via CommentScanner::parseCommentBlock(). During this pass the comment block is split into multiple parts if needed, and data that is needed later, such as section labels, xref items, and formulas, is extracted. Markdown markup is also processed during this pass, via Markdown::process().

Resolving relations

The Entry objects created and filled during parsing are stored as a tree of Entry nodes, which is kept in memory.

Doxygen does a number of tree walks over the Entry nodes in the tree to build up the data structures needed to produce the output.
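A single tree walk of this kind might look like the sketch below. ToyEntry is a hypothetical, heavily reduced node type (the real Entry class carries far more data), and collecting qualified class names is just one example of what a pass could compute.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, heavily reduced parse-tree node.
struct ToyEntry {
    enum class Kind { Root, Namespace, Class, Function };
    Kind kind;
    std::string name;
    std::vector<std::unique_ptr<ToyEntry>> children;

    ToyEntry(Kind k, std::string n) : kind(k), name(std::move(n)) {}

    // Appends a child and returns a reference to it for further nesting.
    ToyEntry &add(Kind k, const std::string &n) {
        children.push_back(std::unique_ptr<ToyEntry>(new ToyEntry(k, n)));
        return *children.back();
    }
};

// One tree walk: collect the qualified names of all classes, in
// depth-first order, carrying the enclosing scope down the recursion.
void collectClasses(const ToyEntry &entry, const std::string &scope,
                    std::vector<std::string> &out) {
    std::string qualified = scope;
    if (entry.kind != ToyEntry::Kind::Root) {
        qualified = scope.empty() ? entry.name : scope + "::" + entry.name;
    }
    if (entry.kind == ToyEntry::Kind::Class) out.push_back(qualified);
    for (const auto &child : entry.children) {
        collectClasses(*child, qualified, out);
    }
}

// Builds a small sample tree: util::Timer::start and App.
ToyEntry makeSampleTree() {
    ToyEntry root(ToyEntry::Kind::Root, "");
    ToyEntry &ns = root.add(ToyEntry::Kind::Namespace, "util");
    ns.add(ToyEntry::Kind::Class, "Timer").add(ToyEntry::Kind::Function, "start");
    root.add(ToyEntry::Kind::Class, "App");
    return root;
}
```

Running several small, single-purpose walks over the same tree, as in this sketch, keeps each pass simple at the cost of traversing the tree more than once.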

The resulting data structures all derive from the generic base class Definition, which holds all non-specific data for a symbol definition.

Definition is an abstract base class. Concrete subclasses are

For doxygen specific concepts the following subclasses are available

Finally, the data for members of classes, namespaces, and files is stored in the subclass MemberDef. This class is used for functions, variables, enums, etc., as indicated by MemberDef::memberType().
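The relationship between the abstract base class and a member class that is discriminated by a type tag can be sketched as below. The Toy-prefixed classes and the ToyMemberType enum are hypothetical reductions for illustration; they do not reproduce the real Definition or MemberDef interfaces.

```cpp
#include <string>
#include <utility>

// Hypothetical reduction of the symbol hierarchy. The "Toy" prefix marks
// these as illustrations, not doxygen's real classes.
class ToyDefinition {
public:
    explicit ToyDefinition(std::string name) : m_name(std::move(name)) {}
    virtual ~ToyDefinition() = default;

    // Data shared by every kind of symbol lives in the base class.
    const std::string &name() const { return m_name; }

    // Pure virtual function: makes the base class abstract.
    virtual std::string kindLabel() const = 0;

private:
    std::string m_name;
};

enum class ToyMemberType { Function, Variable, Enumeration };

// A single class covers all member kinds; the type tag tells them apart.
class ToyMemberDef : public ToyDefinition {
public:
    ToyMemberDef(std::string name, ToyMemberType type)
        : ToyDefinition(std::move(name)), m_type(type) {}

    ToyMemberType memberType() const { return m_type; }

    std::string kindLabel() const override {
        switch (m_type) {
            case ToyMemberType::Function:    return "function";
            case ToyMemberType::Variable:    return "variable";
            case ToyMemberType::Enumeration: return "enum";
        }
        return "unknown";  // not reached for valid enum values
    }

private:
    ToyMemberType m_type;
};
```

Using one class with a type tag, rather than one subclass per member kind, keeps the hierarchy shallow; code that cares about the difference switches on memberType().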

Producing tracing and debug output

Within doxygen there are a number of ways to obtain debug output. Besides the invasive method of putting print statements in the code, there are several easier ways to get debug information.

For a debug build (build option -DCMAKE_BUILD_TYPE=Debug) these options are always available, but for a release build some debug capabilities have to be enabled explicitly (see build options -Denable_tracing=YES and -Denable_lex_debug=YES).

To enable tracing, use the -t option. You can optionally specify the name of a trace file; if omitted, trace.txt will be used. When running with tracing enabled, doxygen writes a lot of internal information to the trace file, which can be used (by experts) to diagnose problems.

During a run of doxygen it is possible to specify the -d command line option with one of the following values (each option has to be preceded by -d):

Producing output

TODO

Documentation Topics TODO