TableGen Jupyter notebook should have a way to filter compiler output

Filing this to collect ideas for future work. Only one person I know of has tried this, and they probably worked around it by using the compiler directly.

**Problem**

If you use LLVM TableGen files like `Target.td` in a notebook, the `llvm-tblgen` output is more than 320,000 lines. This exceeds the output limit Jupyter sets, and removing that limit would likely crash the client.

You might do this if you wanted to make a notebook about adding some LLVM internal thing like a scheduler or an instruction. You wouldn't want every cell to be massive even if the notebook could handle the text.

It's a niche that most people won't hit, so it needs input from people who do to decide what the best tradeoffs are. I don't want to create more things for folks to learn in the process.

**Possible Solutions**

* Arbitrary cutoff for the output, basically `tail -n <N>`.
  * Easiest to understand, but zero nuance.
  * If the content of the includes changes between versions then your `<N>` may need to change.
  * Let's not do this, but I'm writing it here as the "baseline" against which to compare better options.
* Detect the output is too large and return an error to the notebook telling them to use the compiler directly.
  * We're not actually fixing anything, but at least it's clearer.
* Emitting JSON and running one of the JSON query languages on it.
  * Now you’re learning yet another language.
  * The result is more JSON, not the record format you’re used to.
* Pragmas/notes to mark include file content in the output.
  * No way to tell “user” vs. “system” includes apart right now.
  * You may want to see some subset of an included file anyway.
* Regular expression for class and definition names.
  * If we use JSON, same issues as before.
  * Probably could match on the output, but likely easier to make it a compiler option.
  * You are now learning regex, but at least there are sites that make building a regex easy, unlike a JSON query language, I expect.
  * Are two expressions enough? What about multiclasses?
* Marking "new" records somehow by comparing the previous output.
  * You may want a mix of old and new in the output.
  * Still leaves 300k lines in the first cell even if you only want new stuff in the next ones.
  * Not sure we can reliably detect "new" given that the order may not be deterministic.
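
To make the regular expression option more concrete, here is a minimal sketch of matching on the printed output rather than adding a compiler option. It assumes the default record dump, where each record starts with a `class <name>` or `def <name>` line and ends with a line containing only `}`. The function name `filter_records` and the sample text are made up for illustration, and a multi-line code value that happens to contain a bare `}` line would confuse it.

```python
import re

# Assumption: each record in the printed dump starts with "class <name>" or
# "def <name>" at the beginning of a line, e.g. "def ADD {\t// Inst".
RECORD_START = re.compile(r"^(class|def)\s+([^\s<{]+)")


def filter_records(output, name_pattern):
    """Return only the printed records whose name matches name_pattern.

    Hypothetical helper, not part of the TableGen kernel.
    """
    wanted = re.compile(name_pattern)
    kept = []
    current = None        # lines of the record being collected, or None
    keep_current = False  # whether the current record's name matched
    for line in output.splitlines():
        if current is None:
            match = RECORD_START.match(line)
            if match is None:
                continue  # section banner or blank line, skip it
            current = [line]
            keep_current = wanted.search(match.group(2)) is not None
            continue
        current.append(line)
        # Assumption: a line holding only "}" closes the record.
        if line == "}":
            if keep_current:
                kept.append("\n".join(current))
            current = None
    return "\n".join(kept)


# Made-up sample in the shape of the record dump.
sample = """------------- Classes -----------------
class Inst<int Inst:size = ?> {
  int Size = Inst:size;
}
------------- Defs -----------------
def ADD {\t// Inst
  int Size = 4;
}
def SUB {\t// Inst
  int Size = 4;
}
"""

print(filter_records(sample, r"^ADD$"))
```

This uses a single expression for both classes and defs. Splitting it into two, or handling multiclasses, would need a decision about what the notebook user actually writes in the cell.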

TableGen Jupyter notebook should have a way to filter compiler output #72856