What’s New?

In terms of the build graph, we’re bringing a new kind of node into being: the CMI files. So I guess we need to know more about the CMI files now.

Basically, for every source that imports an MI, let's say the source has a compilation context: the compilation flags, the compiler version, and so on. Then, for each MI, we almost always need a separate CMI file for each compilation context in which it is imported.

You can smell the problem now. Traditional build graph automation facilities do not support this kind of "additional reverse tracing". In C++20 module building, let's just assume that, through great dependency scanning, we know that both a.cc and b.cc import a module m exported by m.cc. Good. What next? The next thing is to find out the exact compilation contexts of a.cc and b.cc. Then we trace back: we create a m.<a_context>.gcm for a.cc and a m.<b_context>.gcm for b.cc, find a recipe for them, and list them as prerequisites of a.cc and b.cc.
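For concreteness, here's a minimal made-up layout for that scenario (only the import/export relations matter, not the actual contents):

// m.cc
export module m;
export int answer() { return 42; }

// a.cc
import m;
int a() { return answer(); }

// b.cc
import m;
int b() { return answer() + 1; }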

A chicken-and-egg problem for existing build graph automation facilities: to them, the CMI files a.cc uses are surely part of its compilation context. But to know which CMI files a.cc uses, we need to know the compilation context of a.cc.

The problem might show up in different forms, depending on which build graph automation facility you have in mind. But essentially, it's a chicken-and-egg problem.

It can be solved, and that is the story of the next section. For now, let's assume we have already found the right CMI files and focus on how we stuff them into the build graph.

Problems

Order-only Prerequisites Don’t Work

Order-only prerequisites are a concept that many people are familiar with. Basically, when A is an order-only prerequisite of B, the following holds:

A state         B state         Behavior
not there       needs update    builds A, then B
not there       up to date      builds A
up to date      needs update    builds B
needs update    up to date      builds A
needs update    needs update    builds A, then B (order guaranteed)

It is what it literally means. "Order-only" means it only constrains the order in which the two targets are built, if they are both going to be built. "Prerequisite" means it's a prerequisite after all, so if A is not there, we will at least build it.

Now where does this fit in? Let's imagine A is an MI (or rather, its CMI file), and B is a source (or rather, its object file). Now if:

  1. the MI changes, but the source doesn’t.

    • If the MI is compatible, we don't need to rebuild the object file: the CMI is intermediate information used for correctness checking and doesn't contribute to the codegen, so the object file remains correct.

    • If the interface is incompatible, although we can’t abort the build right away, the compilation will fail at some stage anyway. We don’t need to warn the user here and now. (Though if we could it would be amazing.)

  2. the MI doesn’t change, but the source does.

    • Just rebuild the object file, bro. Pretty obvious.

  3. the MI and the source both change.

    • If the MI is compatible, rebuilding both in the correct order will keep the CMI file up-to-date and object file correct. Great.

    • If the MI is incompatible, the newly built, up-to-date CMI file will help the compiler generate the correct error message when re-compiling the object file. Expected behavior observed.

"Compatible" means any changes to the MI won’t affect the codegen of the object file. Now doesn’t it sound amazing? For one time, the model fits the reality. But sadly no.

There's one thing in C++ called conditional compilation. It's not the preprocessor, which doesn't reach across module imports (though it does across header unit imports). It's the if constexpr statement. An MI can export a compile-time constant, which might be used in an if constexpr statement for conditional compilation. So far, we don't have a way to tell whether this is happening.
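A tiny made-up example of what that looks like:

// m.cc
export module m;
export constexpr bool use_fast_path = true;

// a.cc
import m;
int compute(int x) {
    if constexpr (use_fast_path)
        return x << 1;      // only this branch is codegen'd today
    else
        return x * 2;       // flipping use_fast_path in m revives this one instead
}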

In engineering, we always think of the worst case. So this fact practically "incompatifies" all changes in an MI. It means we'll always have to recompile the object file if the MI changes. So, no. No use of order-only prerequisites between a source and the MIs it imports.

One Shot From Fortran

(Correct me if I’m wrong. I’m not super familiar with Fortran.)

We already mentioned that compilers can emit CMIs as by-products of object files. This is often called "one-shot" compilation.

We also know that the by-product CMI has approximately no value, because of the compatibility issue. In fact, we always want to arrange the CMI generation according to the importing files' compilation context.

Then why not discard them? Surely we can. But it would be great if we didn't. As mentioned, the tooling of C++20 modules is an analog of Fortran modules. As a matter of fact, it's so much of an analog that certain infrastructure, like P1689R5, is designed to be used for Fortran at the same time. And Fortran compilers, for God's sake, only support one-shot compilation. Fortran also does not suffer from the compatibility issue, and one-shot compilation has been its tradition for decades.

So we can’t discard them, for our beautiful dream of building shared infrastructure.

It's just a minor issue, you know. If we build a good enough system, adding one more CMI instance is just 1 or 2 lines of code. Nevertheless, it's a design decision that is worth a section.

Collate

The key to C++20 module support, and to solving the chicken-and-egg problem, is a 2-stage build model. The new stage is called "collate".

For any build system, before arranging the build graph and starting the actual build, we have to collect the dependency information. That's true for C++20 modules as well. The only difference is that, for traditional header files, an #include directive is resolved immediately at the compiler level (the compiler looks up the file in the include search path), while for C++20 modules, resolving an import is more like the build system's job.

You know, when you #include a header file, the name is the file itself. But when you import or export a module, you’re only given a name, not a file. The mapping between the canonical module name and the actual file on disk is done by the build system.
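To make the contrast explicit, here's an illustrative snippet (not tied to any particular compiler or build system):

// Headers: the name in the directive is a file; the compiler resolves it
// by searching the include path.
#include "m.h"

// Modules: the name is a module name, not a path. Something outside the
// compiler proper has to say which file (and which CMI) "m" corresponds to.
import m;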

Collating does the mapping job. Given a set of source files, it finds out which file exports which MI and which file imports which MI. Then it instantiates a CMI file for each MI in each compilation context.
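As a rough mental model, and nothing more than a sketch with made-up type and field names, the output of collating could look something like this:

#include <map>
#include <string>
#include <utility>

// One possible shape for the collate output; all names are made up.
struct CollateResult {
    // module name -> source file that exports it
    std::map<std::string, std::string> exporters;
    // (module name, compilation context id) -> path of the CMI instance to build
    std::map<std::pair<std::string, std::string>, std::string> cmi_instances;
};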

That’s how you find out the right CMI files for each importing file. After collating, we can safely pass the information to the build graph automation facility like usual.

Doing it as a separate stage is almost necessary. Not only is it the only practical way to solve the chicken-and-egg problem (unless you want the build graph automation facilities to have some powerful scripting abilities, which would be a disaster for performance, brevity, consistency, maintainability, etc.), it also lets the build system know more about the build, which could be crucial for extended build system features like version management, package management, and compilation caching.

But sure. It’s not easy. It’s never easy.

Problems

Dynamic Module Mapping

The first, and probably the biggest, trouble is the dynamicness of the compilation context, which further leads to the dynamicness of the module mapping.

The "dynamicness", of course, is within the scope of the collating process. Here’s an unlikely but theoretically important example:

// m_option1.cc
#if defined(OPTION1)
export module m;
// something identical...
#endif

// m_option2.cc
#if defined(OPTION2)
export module m;
// something identical...
#endif

// a.cc
import m;
// do something...

// b.cc
import m;
// do something...

And now we'll be damned. Imagine a.cc has the compilation flag -DOPTION1 and b.cc has the compilation flag -DOPTION2. They import the same module m, but from different sources.

Or imagine that they don't directly import m, but both import a module n, which imports m. In this case, there will be 2 CMI files for n, inheriting a compilation context from a.cc and b.cc respectively. Now, technically speaking, an identical module n is importing 2 different modules named m.
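In code, that intermediate module would look something like this (again, a made-up snippet):

// n.cc
export module n;
import m;   // which m_optionX.cc this resolves to depends on the compilation
            // context that this particular CMI of n inherits
// something that uses m...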

Again, it’s a highly unlikely scenario. But it does hint at the potential problems.

The solution, however, is quite easy to come up with. We just have to rescan the source files in each new compilation context.

Performance vs. Correctness

While rescanning solves the dynamicness issue, it could be a performance killer. After all, though preprocessing for module dependency scanning is quite fast, it's still preprocessing. Not to mention that in real-world scenarios, there are likely very, very many compilation contexts in a project.

Do we have a way to strike a balance between performance and correctness?

No. Not that I can think of. The solution I have to this is "let the user decide".

Based on some assumptions, like "no more than one file exports the same module under any compilation context", we can optimize away a lot of rescanning. For example, under this assumption, we scan and match the first file, then pass the current scanning result to its importing files as a kickstarter cache. When mapping imports for the next file, we try to find a match in the kickstarter cache. If found, we rescan only that particular file. If the rescan still matches, we take it as a positive and skip the rest of the rescanning. And we do this recursively until the end of the import chain.
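Here's a rough sketch of that kickstarter-cache idea; ScanResult, Scanner, the string-based context, and collate_one are all made-up stand-ins for a real dependency scanner and collator, not an actual implementation:

#include <functional>
#include <map>
#include <string>
#include <vector>

using Context = std::string;                            // simplified stand-in for a compilation context
using ModuleMap = std::map<std::string, std::string>;   // module name -> file that exports it

struct ScanResult {
    std::vector<std::string> exports;   // modules the file exports under this context
    std::vector<std::string> imports;   // modules the file imports under this context
};

// Hypothetical scanner: preprocess the given file under the given context
// and report its module dependencies.
using Scanner = std::function<ScanResult(const std::string&, const Context&)>;

// Resolve the imports of `file` under `ctx`, using `cache` (the module -> exporter
// map discovered in a previously scanned context) as the kickstarter cache.
ModuleMap collate_one(const std::string& file, const Context& ctx,
                      const ModuleMap& cache, const Scanner& scan)
{
    ModuleMap resolved;
    for (const std::string& mod : scan(file, ctx).imports) {
        auto hit = cache.find(mod);
        if (hit != cache.end()) {
            // Cache hit: rescan only the candidate exporter in this context.
            ScanResult r = scan(hit->second, ctx);
            bool still_exports = false;
            for (const std::string& e : r.exports)
                if (e == mod) { still_exports = true; break; }
            if (still_exports) {
                resolved[mod] = hit->second;   // positive: skip rescanning everything else
                continue;
            }
        }
        // Cache miss or mismatch: fall back to a full rescan in this context (not shown).
    }
    return resolved;
}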

I haven't really sat down to write a formal proof for it. But I think correctness does not break under the assumption, and it should avoid a lot of unnecessary rescanning.

The trick is, we make it an option. Say, "smart-rescan" mode.

Then when correctness is crucial and you don’t have these assumptions at hand, you can turn to "always-full-rescan" mode. Or "paranoid" mode, I guess. Of course, for the simplest scenario, we can have a "never-rescan" mode or something like that. I know. I know naming is an important business. But I just don’t have the talent.

But anyway, we're able to find the right CMI paths for the build now, thus avoiding the chicken-and-egg problem that would otherwise occur later.

Also, all kinds of caching mechanisms should be provided; they're an important performance booster.

Of course, they’re just the best ideas I can offer. If you have a perfect solution or anything like that, I’m all ears.

The Model

Now the whole picture is clear. We have decided to do a 2-stage static build, and we have planned out the specific options and workflows for the collating stage, whose result is later fed into the build graph automation facility, so we don't have to modify our build backend or hurt our performance.

The rest is implementation details, like how to perform all the caching magic and how to fit the whole model into SCons. Caching can be improved incrementally, so let's talk about my picture of how to do it in SCons.

We'll have a new Program builder, but this time a pseudo-builder. It first tests whether each source is a C++ file. If so, it saves the file's metadata to a mapper instance. The instance could be an attribute of the Environment, or a global instance, or anything that can be located. The mapper is the one that takes care of everything, from the visibility control of modules to the CMI instantiation. It would do the caching internally as well. Then our new pseudo-builder invokes the old Program builder as usual.

Meanwhile, the C++ source scanner has to change. It should ask the mapper for the paths of the CMI instantiations that the source needs, and populate them as valid Nodes with the correct build actions.

Note that a global mapper is necessary. There must be a place where we put all the source files, and only after every source file is there can we begin scanning and collating. A pseudo-builder and a source scanner look perfect for this job.

Just a vague vision. I'm not super familiar with SCons either :(

I haven't implemented the full model described in this post yet. I've been implementing prototypes, refactoring them, getting pissed off, and refactoring again. The iteration of ideas, however, seems to be arriving at a fine end here, which is why I'm writing this post.

To Be Continued

The next post should be an "implementation report" of sorts. So wait until I implement it. I'll try to have a WIP GitHub repository, though I'm not sure how I should do it. (I think it's difficult to work with others on an incomplete codebase, especially in the form of git.)

Good luck to you all, guys. And I curse WG21 for this module mess.