PEP 757 – C API to import-export Python integers

I’m reading the PEP and treating it a bit like the ultimate API documentation that I expect to come out of an implementation (i’m not looking at the PR for now). I think my main issue is the export API feels overly complex and if it is going to be, we need to explicitly capture why that is in the PEP. Otherwise beyond that, I’m highlighting things that could be described better but the APIs feel fine.


Export API

There are two cases:

The feels like a footgun adding needless complexity to an API intended for use primarily with other bignum libraries. It means that all users must now use a conditional to know which data to use and some may think they’re clever by avoiding a call to FreeExport each of which adds conditionals and branches to something that could be done without those. Why are there two cases? I’d simplify this to get rid of .value in favor of always using .digits and always require FreeExport to be called afterwards.

If this was somehow aiming to be efficient, it is too complicated. Keep the simpler API and if you want efficiency for the small case, calling any of the PyLong_As*AndOverflow APIs first and using the result if (!overflow) is a much simpler conditional before calling a digits export API. That way you don’t force people to handle multiple formats and it does not smell like an API of our internal implementation detail of the moment showing. The gmpy2 example would be simpler this way, only two mpz_ calls instead of three.

If there are compelling reasons why the Export API should be this complex, please document why and include a Rejected Ideas section of making it simpler as described.


Import API

PyLongWriter_Create

Document the intent of the function and why it works the way it appears to. I had to read this section multiple times to understand what it meant to be a “writer” and that it wasn’t just a simple API taking an array of digits as input in one call, but was instead an API allocating space for the caller to then later go and fill in digits themselves. This needs documentation as to why it works this way with an example being provided.

Do we do any validation of the digit values filled in? ie: What happens on Finish if values larger than 2**bits_per_digit are present in the digits array? Document that. That being undefined behavior is okay so long as we state it. (Implementation wise, I might leave an assertion loop in place on !NDEBUG builds myself)

ndigits is the number of digits in the digits array. It must be greater than or equal to 0.

What does it mean to have 0 digits? That sounds like not-a-number. Document exactly what 0 digits means and when 0 digits would be used. Is *digits written to when digits == 0? Is this just a way to say 0 via this API without filling in a digit?

The caller can either initialize the array of digits digits and then call PyLongWriter_Finish() to get a Python int, or call PyLongWriter_Discard() to destroy the writer instance. Digits must be in the range [0; (1 << sys.int_info.bits_per_digit) - 1]. Unused digits must be set to 0.

Don’t refer to a value only obtainable from Python. Presumably you meant to reference the added PyLong_GetNativeLayout struct’s bits_per_digit here.

PyLongWriter_Finish & PyLongWriter_Discard

explicitly document that these calls are mutually exclusive. The PyLongWriter is invalid after either call.


In general I agree that official Python C APIs to better play with other bignum implementations have been a long time coming.

I don’t necessarily like that we expose internal Python bigint implementation details, but this keeps the code maintenance burden on our side easy. If, for example, we were to abandon our 15/30bit digits approach to bignums in favor of a classic pure base 2 bit array, these APIs would still be usable. As the bits_per_digit concept can also represent that.

The only thing we’d be unable to do is stop using a binary-multiple base. But that is extremely unlikely to ever be desired as hardware always works in binary-multiple bases and there are no signs of that changing. So I consider it a non-issue.

3 Likes