Skip to content

Intl: Add a new IntlListFormatter class #18519

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

BogdanUngureanu
Copy link
Contributor

This PR adds support for ICU's ListFormatter implementation by implementing a new IntlListFormatter class in PHP.

The class supports ANDs, ORs and UNIT format types. However, they are only available if ICU version is 67 or above. On older versions of ICU, we only allow TYPE_AND and WIDTH_WIDE - we need this because, from what I've understood, the minimum version ICU supported is 57, which was bumped from 50 last year.

The class API is pretty simple:

$formatter = new IntlListFormatter('EN_US', IntlListFormatter::TYPE_AND, IntlListFormatter::WIDTH_WIDE);
echo $formatter->format([1,2,3]);

will display

1, 2, and 3

I have some questions here:

  • Do I need to create an RFC for this new addition?
  • I haven't used a namespace and used Intl prefix for consistency sake. Should it be behind a Intl namespace? I haven't seen a class that uses it so I thought I should go with the prefix.

@TimWolla
Copy link
Member

TimWolla commented May 8, 2025

Do I need to create an RFC for this new addition?

I wouldn't need an RFC if this is a straight forward addition to the existing Intl API which just maps to ICU in a straight forward fashion. I would like to see some smaller improvements, though:

  • Making the class final and /** @not-serializable */. This allows you to avoid the “not properly constructed” checks.
  • /** @strict-properties */ to further lock down the class against misuse.

I haven't used a namespace and used Intl prefix for consistency sake. Should it be behind a Intl namespace? I haven't seen a class that uses it so I thought I should go with the prefix.

The prefix makes sense to me. Using the Intl namespace should properly happen when designing an Intl API that isn't just a straight forward mapping to ICU, but also provides a nice API (e.g. enums, proper class and exception hierarchy, …).

@BogdanUngureanu
Copy link
Contributor Author

Hi @TimWolla! Thanks for the suggestions - great ideas! I've pushed a commit that makes the class final and enabled those flags.

Copy link
Member

@TimWolla TimWolla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some more Stub nits. Please do not add PHPDoc comments, unless they allow you to express something that the types themselves cannot.

@BogdanUngureanu
Copy link
Contributor Author

Some more Stub nits. Please do not add PHPDoc comments, unless they allow you to express something that the types themselves cannot.

Thanks for the quick response! I'll make the changes tomorrow. :)

@devnexen
Copy link
Member

devnexen commented May 8, 2025

windows CI failure is related otherwise

@BogdanUngureanu BogdanUngureanu requested a review from devnexen May 10, 2025 22:46
intl_convert_utf8_to_utf16(&ustr, &ustr_len, ZSTR_VAL(str_val), ZSTR_LEN(str_val), &status);
if (U_FAILURE(status)) {
for (uint32_t j = 0; j < i; j++) {
efree((void *)items[j]);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: do you get a warning during your build if you remove the (void *) cast ? unsure it s really necessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, I get a warning:

/listformatter_class.c:158:17: warning: passing 'const UChar *' (aka 'const unsigned short *') to parameter of type 'void *' discards qualifiers [-Wincompatible-pointer-types-discards-qualifiers]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the const qualifier ... can items just be mutable and eventually doing the cast at ulistfmt_format ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm... I'm not sure if that's simpler because we also need to convert it to utf16

@devnexen
Copy link
Member

As first round of reviews, look good, I ll play a bit with your branch I think tomorrow.

@BogdanUngureanu
Copy link
Contributor Author

Hi @kocsismate, thanks for the feedback! I've pushed a commit that should address it.

Copy link
Member

@kocsismate kocsismate left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I'm only able to review this in small chunks :(

@BogdanUngureanu
Copy link
Contributor Author

@kocsismate no worries, I'm able to work on it only in the evenings anyway. :)

@devnexen
Copy link
Member

seems the most important points had been covered 👍🏼, I ll look at it sometime this week-end.

Copy link
Member

@nielsdos nielsdos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, wait for others for final confirmation

ZEND_HASH_FOREACH_VAL(ht, val) {
zend_string *str_val;

str_val = zval_get_string(val);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One optimization that you could do (optional) is using zval_get_tmp_string instead. Although it probably doesn't matter too much here

int32_t resultLength;
UChar *result = NULL;

resultLength = ulistfmt_format(LISTFORMATTER_OBJECT(obj), items, itemLengths, count, NULL, 0, &status);
Copy link
Member

@devnexen devnexen May 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

quick question, do you think there is a benefit of using ulistfmt_formatStringsToResult beside the "no need to calculate the buffer first" ? any drawback ? Think more of use case to apply in php rather than api conveniency. Cheers.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note I m not holding the PR merge at all, just asking out of curiosity/trying to see a possible future improvement :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what would be the advantage (besides not calculating the buffer). The drawback is that it's supported since ICU 64 and PHP Intl supports 57 and up. I guess we could conditionally use it for >= ICU 64, but personally I would keep the code simple :)

Copy link
Member

@devnexen devnexen May 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say this ... ulistfmt_formatStringsToResult is not necessarily interesting by itself but what you can get out of it however is what is called field positions via a UFieldPositionIterator, going through it with ufieldpositer_next ... you can get nice additional infos ... but again my question was more to give food for thoughts ;)

Copy link
Member

@devnexen devnexen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good as is, played long enough with your branch. @TimWolla to double check the stubs parts.

Copy link
Member

@TimWolla TimWolla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't check the details of the implementation, but the API looks good to me (it's consistent with the rest of ext/intl, just fixing some of the larger gotchas).

@BogdanUngureanu
Copy link
Contributor Author

Hey everyone, thank you so much for the reviews! It's great seeing so many ✅! @devnexen what would be the next step to move forward with this PR? :)

@devnexen
Copy link
Member

Hey everyone, thank you so much for the reviews! It's great seeing so many ✅! @devnexen what would be the next step to move forward with this PR? :)

after the last nitpicks, that should be all if you want you can update UPGRADING yourself or I will.

@BogdanUngureanu
Copy link
Contributor Author

Alright, thanks @devnexen. I've pushed a commit that removes that condition and adds the php ending tag in phpt files. As for updating the UPGRADING, I'll leave it to you if you can :)

@devnexen devnexen closed this in 3f75452 May 18, 2025
@devnexen
Copy link
Member

Thanks !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants