Path profiling interface proposal

Hello,

I am planning on contributing path profiling to LLVM by the end of August. I have written a document of what the interface to the path profiles would look like at that time. If someone has any amendments, I can incorporate them.

http://www.cs.ualberta.ca/~pejic/PathProfiling.html

Slobodan Pejic

Hello,

I am planning on contributing path profiling to LLVM by the end of
August. I have written a document of what the interface to the path
profiles would look like at that time. If someone has any amendments, I
can incorporate them.

http://www.cs.ualberta.ca/~pejic/PathProfiling.html

Slobodan,

This looks interesting. I have a few questions.

Is there anything about your proposed profiling method that requires the
use of lli or llvm-ld? I'm assuming the instrumentor itself will simply
be a Pass. Will this work in setups that use the machine's native linker
as long as it links in libprofile_rt?

Is libprofile_rt built during the LLVM build process? Is the intent for
libprofile_rt to be the home for all profiling runtime or just path
profiling? If the latter, I'd suggest a different name, like
libpathprofile_rt. I'm not knowledgeable about the LLVM profiling
infrastructure, so maybe libprofile_rt is something that already exists.

PathProfileInfo& profileInfo = getAnalysis <PathProfileInfo>();

PathProfileInfo is an ImmutablePass, yes?

/*
* The following method allows querying for path numbers by specifying
* F, the function that the paths are in;
* start, starting iterator over basic blocks;
* end, ending iterator over basic blocks;
* order, ascending or descending ordering by count.
* Returns an iterator over Path objects.
*/
PathProfileInfo::iterator PathProfileInfo::selectPaths (
    Function &F,
    std::vector<BasicBlock*>::iterator start,
    std::vector<BasicBlock*>::iterator end,
    PathProfileInfo::Order order);

I don't understand what this means. Is this an interface to get at the paths
in some subset of a Function's basic blocks? If so, how would one specify a
disjoint set of basic blocks to examine? I don't know if that's useful but if
you're providing an interface to iterate over subsets of path data it seems
natural to want to generalize it.

I also don't understand the "order" parameter. How are paths ordered? By
number? Why does the returned order of paths matter?

/*
* Returns the execution count for a path.
*/
unsigned long Path::getCount()

/*
* Returns the execution frequency for a path.
*/
float Path::getFrequency()

The naming is a bit confusing here. "getCount" makes sense, but
"getFrequency" could mean a few different things. My assumption is that it
means the % of times this path was taken vs. the number of times the Function
was invoked. It could also mean the % of times this path was executed
relative to some "parent" path (think of a subpath through conditional code,
i.e. for getting branch taken/not taken frequencies).

Some of these questions are answered later in your document but they really
need to be specified up front. PathProfileInfo::DESC, for example, has no
meaning outside of the comments in this context:

/* Query for paths in hot to cold order. */
PathProfileInfo::iterator path = PI.selectPaths(F, blocksRequired.begin(),
  blocksRequired.end(), PathProfileInfo::DESC);

All in all, looks good. This will be a useful addition.

                               -Dave

Hello,

I am planning on contributing path profiling to LLVM by the end of
August. I have written a document of what the interface to the path
profiles would look like at that time. If someone has any amendments, I
can incorporate them.

One more comment:

/* Query for paths in hot to cold order. */
PathProfileInfo::iterator path = PI.selectPaths(F, blocksRequired.begin(),
  blocksRequired.end(), PathProfileInfo::DESC);
/* Iterating over the first path which is the hottest. */
for(Path::iterator block = path->begin(); block != path->end(); ++block)
{
  cerr << "Traversing basic block: " << block->getName() << "\n";
}

This seems a bit too cute to me. There's no reason clients can't iterate
over the returned paths and find the hot one. Standard C++ algorithms
can do this very easily. I don't think it's worth the complexity of an
"order" parameter in the query interface and the sorting, etc. that that
implies. If there's a large number of paths, the client can do the sorting
itself. A PathCount() interface might be useful.

                                  -Dave

David Greene wrote:

Hello,

I am planning on contributing path profiling to LLVM by the end of
August. I have written a document of what the interface to the path
profiles would look like at that time. If someone has any amendments, I
can incorporate them.

http://www.cs.ualberta.ca/~pejic/PathProfiling.html

Slobodan,

This looks interesting. I have a few questions.

Thanks for taking the time to look through!

Is there anything about your proposed profiling method that requires the
use of lli or llvm-ld? I'm assuming the instrumentor itself will simply
be a Pass. Will this work in setups that use the machine's native linker
as long as it links in libprofile_rt?

The current static library libprofile_rt is being built as a .bca file
(archive of .bc files.) Presumably the build process can be changed to
create a native archive, thus allowing the native linker to be used. I
will give this a try when I get a chance.

This section was included because running edge profiling from c sources
to the final executable threw me off when I was familiarizing myself
with LLVM. The edge instrumentor can only instrument modules with a
main function, so any modules that one wants instrumented must be linked
to the module with main first. My Instrumetor is similar in structure.

Is libprofile_rt built during the LLVM build process? Is the intent for
libprofile_rt to be the home for all profiling runtime or just path
profiling? If the latter, I'd suggest a different name, like
libpathprofile_rt. I'm not knowledgeable about the LLVM profiling
infrastructure, so maybe libprofile_rt is something that already exists.

Yes, libprofile_rt already exists, however it must be built by invoking
make in runtime/.

PathProfileInfo& profileInfo = getAnalysis <PathProfileInfo>();

PathProfileInfo is an ImmutablePass, yes?

/*
* The following method allows querying for path numbers by specifying
* F, the function that the paths are in;
* start, starting iterator over basic blocks;
* end, ending iterator over basic blocks;
* order, ascending or descending ordering by count.
* Returns an iterator over Path objects.
*/
PathProfileInfo::iterator PathProfileInfo::selectPaths (
    Function &F,
    std::vector<BasicBlock*>::iterator start,
    std::vector<BasicBlock*>::iterator end,
    PathProfileInfo::Order order);

I don't understand what this means. Is this an interface to get at the paths
in some subset of a Function's basic blocks? If so, how would one specify a
disjoint set of basic blocks to examine? I don't know if that's useful but if
you're providing an interface to iterate over subsets of path data it seems
natural to want to generalize it.

The function is supposed to allow the compiler developer to find the
subset of paths that contain certain basic blocks. E.g. consider this CFG:

  A
/ \
B C
\ /
  D
/ \
E F
    >\
    H I

If one passes basic blocks B, and F, then selectPaths would return an
iterator over ( (A,B,D,F,H), (A,B,D,F,I) ).

I will need to make this clearer in the document.

I also don't understand the "order" parameter. How are paths ordered? By
number? Why does the returned order of paths matter?

The idea is to load as little as possible from the on-disk profile if
the entire profile is not required. In order to delay loading the
profile (which may have been sorted by count in the runtime library
saving the profile), the iterator over paths can load the data lazily.
Some potential passes may want to look at the top paths that account for
80% execution, and will not require the majority of the path profile.

/*
* Returns the execution count for a path.
*/
unsigned long Path::getCount()

/*
* Returns the execution frequency for a path.
*/
float Path::getFrequency()

The naming is a bit confusing here. "getCount" makes sense, but
"getFrequency" could mean a few different things. My assumption is that it
means the % of times this path was taken vs. the number of times the Function
was invoked. It could also mean the % of times this path was executed
relative to some "parent" path (think of a subpath through conditional code,
i.e. for getting branch taken/not taken frequencies).

Your assumption is correct. This function can be renamed to
getFrequencyPerFunction().

Some of these questions are answered later in your document but they really
need to be specified up front. PathProfileInfo::DESC, for example, has no
meaning outside of the comments in this context:

/* Query for paths in hot to cold order. */
PathProfileInfo::iterator path = PI.selectPaths(F, blocksRequired.begin(),
  blocksRequired.end(), PathProfileInfo::DESC);

All in all, looks good. This will be a useful addition.

Thanks for the feedback.

This section was included because running edge profiling from c sources
to the final executable threw me off when I was familiarizing myself
with LLVM. The edge instrumentor can only instrument modules with a
main function, so any modules that one wants instrumented must be linked
to the module with main first. My Instrumetor is similar in structure.

Ooh, that's a nasty restriction. Is it inherent to the algorithm or
just the implementation? To be generally useful we need to remove this
restriction.

Yes, libprofile_rt already exists, however it must be built by invoking
make in runtime/.

Ok, that's no problem.

The function is supposed to allow the compiler developer to find the
subset of paths that contain certain basic blocks. E.g. consider this CFG:

Ah, ok, that makes sense. The comments should say something about this.

> I also don't understand the "order" parameter. How are paths ordered?
> By number? Why does the returned order of paths matter?

The idea is to load as little as possible from the on-disk profile if
the entire profile is not required. In order to delay loading the
profile (which may have been sorted by count in the runtime library
saving the profile), the iterator over paths can load the data lazily.
Some potential passes may want to look at the top paths that account for
80% execution, and will not require the majority of the path profile.

How large are the profile files, generally? I'd imagine they can get pretty
huge. Lazy loading is a good idea. Perhaps the comments can explain this
a bit more and provide the various values defined for "order" and their
meaning.

Your assumption is correct. This function can be renamed to
getFrequencyPerFunction().

I'm not sure that's any clearer. Unfortunately, I can't think of anything
better at the moment. In any case, the comments should explain what it is.

Thanks for your work on this!

                             -Dave