Generate Lib/opcode.py from Python/bytecodes.c #102674

Closed
@gvanrossum

Description

This could also auto-generate Include/opcode.h, Include/internal/pycore_opcode.h and Python/opcode_targets.h, subsuming both Tools/build/generate_opcode_h.py and Python/makeopcodetargets.py -- although the simplest approach would probably be to just keep those tools and only ensure that they are re-run after opcode.py is updated.

Variables to generate

The auto-generatable contents of opcode.py are as follows:

  • opmap -- make up opcode numbers in a pass after reading all input; special-case CACHE = 0
  • HAVE_ARGUMENT -- done while making up opcode numbers (group argument-less ones in lower group)
  • ENABLE_SPECIALIZATION = True
  • EXTENDED_ARG -- set from opcode for EXTENDED_ARG inst, if there is one
  • opname -- invert opmap
  • pseudo opcode definitions, can add DSL syntax pseudo(NAME) = { name1, name2, ... };
  • hasarg -- can be derived from instr_format metadata
  • hasconst -- could hardcode to LOAD_CONST
  • hasname -- may have to check for occurrences of co_names in code?
  • hasjrel -- check for JUMPBY with arg that doesn't start with INLINE_CACHE_ENTRIES_
  • hasjabs -- no longer used, set to [] for backwards compatibility
  • haslocal -- opcode name contains '_FAST'
  • hascompare -- opcode name starts with COMPARE_
  • hasfree -- opcode name ends in DEREF or CELL or CLOSURE
  • hasexc -- pseudo opcode, name starts with SETUP_
  • MIN_PSEUDO_OPCODE = 256
  • MAX_PSEUDO_OPCODE -- derive from pseudo opcodes
  • __all__ -- hardcode
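A rough sketch of the numbering pass and a few of the name-based rules from the list above. The input shape (a list of `(name, takes_arg)` pairs) and the helper names `assign_opcodes` and `classify` are assumptions made for illustration; they are not part of generate_cases.py, and the real generator would read this information from the parsed bytecodes.c.

```python
def assign_opcodes(instructions):
    """Return (opmap, have_argument) for (name, takes_arg) pairs.

    The input shape is a hypothetical stand-in for parsed bytecodes.c data.
    """
    no_arg = [n for n, takes_arg in instructions
              if not takes_arg and n != "CACHE"]
    with_arg = [n for n, takes_arg in instructions
                if takes_arg and n != "CACHE"]

    opmap = {"CACHE": 0}  # special case: CACHE is always opcode 0
    # Argument-less opcodes get the lower numbers, the rest follow.
    for name in no_arg + with_arg:
        opmap[name] = len(opmap)

    # First opcode number that takes an argument.
    have_argument = 1 + len(no_arg)
    return opmap, have_argument


def classify(opmap):
    """Derive opname and some has* lists using the name-based rules."""
    opname = [f"<{op}>" for op in range(256)]
    for name, op in opmap.items():
        opname[op] = name  # opname is the inverse of opmap
    return {
        "opname": opname,
        "haslocal": sorted(op for n, op in opmap.items() if "_FAST" in n),
        "hascompare": sorted(op for n, op in opmap.items()
                             if n.startswith("COMPARE_")),
        "hasfree": sorted(op for n, op in opmap.items()
                          if n.endswith(("DEREF", "CELL", "CLOSURE"))),
        "hasjabs": [],  # no longer used, kept for backwards compatibility
        "EXTENDED_ARG": opmap.get("EXTENDED_ARG"),  # None if absent
    }


# Illustrative input only; not the real instruction set.
instructions = [
    ("CACHE", False), ("POP_TOP", False), ("NOP", False),
    ("LOAD_FAST", True), ("COMPARE_OP", True),
    ("LOAD_DEREF", True), ("EXTENDED_ARG", True),
]
opmap, HAVE_ARGUMENT = assign_opcodes(instructions)
tables = classify(opmap)
```

Pseudo opcodes, hasarg, hasname and hasjrel are left out here because they need metadata beyond the instruction name.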

The following are not public but imported by dis.py so they are still needed:

  • _nb_ops -- just hardcode?
  • _specializations -- derive from families (with some adjustments)
  • _specialized_instructions -- compute from _specializations
  • _specialization_stats -- only used by test__opcode.py, move into there?
  • _cache_format -- compute from cache effects? (how to make up names?)
  • _inline_cache_entries -- compute from _cache_format

The hardcoded stuff can go in prologue and epilogue sections that are updated manually.
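One possible way to keep a manually maintained prologue and epilogue around a generated middle section is to splice between marker comments. The marker strings and the `splice_generated` helper are hypothetical, chosen only for this sketch.

```python
# Hypothetical marker comments delimiting the generated region.
BEGIN = "# BEGIN GENERATED"
END = "# END GENERATED"


def splice_generated(source: str, generated: str) -> str:
    """Replace the text between the markers, keeping prologue and epilogue."""
    head, found_begin, rest = source.partition(BEGIN)
    _, found_end, tail = rest.partition(END)
    if not found_begin or not found_end:
        raise ValueError("generated-section markers not found")
    return f"{head}{BEGIN}\n{generated.rstrip()}\n{END}{tail}"


old = "prologue\n# BEGIN GENERATED\nold = 0\n# END GENERATED\nepilogue\n"
new = splice_generated(old, "new = 1\n")
```

Everything outside the markers is copied through unchanged, so the hardcoded parts are never touched by the generator.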

This project (if we decide to do it) might be a good reason to refactor generate_cases.py into a library (the code for this shouldn't be shoved into the main file).

Benefits

We can wait on this project until we are sure we need at least one of the following benefits:

  • Once we are generating the numeric opcode values it will be easier to also generate numeric values for micro-opcodes.
  • Avoid having to keep multiple definitions in sync (e.g. families, cache formats).
  • Easier to understand. E.g. where are the numeric opcode values for specialized instructions defined? (Would require also subsuming generate_opcode_h.py.)

Labels: type-feature (a feature request or enhancement)
