Skip to content

bpo-20180: use Argument Clinic in itertools module where appropriate #4117

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 5 commits into from

Conversation

taleinat
Copy link
Contributor

@taleinat taleinat commented Oct 25, 2017

I applied my old patch for itertools from the old "AC Derby" days, fixed all that needed to be fixed and made a few minor AC-related fixes as suggested by @serhiy-storchaka on the issue tracker.

To get this working with the separate clinic/itertoolsmodule.c.h file, I also moved all of the method definition arrays and static object definitions to the end of the file, so that the #include statement could come before them.

Many thanks @serhiy-storchaka for reviewing the patch and providing guidance!

https://bugs.python.org/issue20180

This is required to be able to include the generated AC code from
clinic/itertoolsmodule.c.h, which must appear before these definitions
but after all typedefs and type declarations.
@taleinat
Copy link
Contributor Author

Note that I purposefully made this in two commits to make the steps clearer. Consider reviewing each commit individually.

@taleinat
Copy link
Contributor Author

Also note: While moving around and re-formatting doc-strings (required for AC), I also improved some of them. Reviewers, please make sure to review doc-string changes.

@vstinner vstinner self-requested a review October 25, 2017 15:45
Copy link
Member

@serhiy-storchaka serhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left reviewing the docstrings on to @rhettinger.

all have the same key. The common key for the group is the first item in the
pair.

Example:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently examples are not included in docstrings. I think it is better to keep examples in the module documentation and keep docstrings brief and clear.

Copy link
Contributor Author

@taleinat taleinat Oct 25, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There were already examples in some of the doc-strings, e.g. for product() and combinations(), so I added some to doc-strings where there were no examples based on examples I found in the docs.

@rhettinger, should I remove all examples from doc-strings, keep only those which were already in the doc-strings, or keep the new examples as well?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before adding an example, I would try adding a clear summary of the class. Why did you remove the original summary, “an iterator that returns consecutive keys and groups”?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually don't remember, this is from a patch I wrote a long time ago... We can just leave the current doc-string (though it doesn't follow the "start with one short line" principle):

Make an iterator that returns consecutive keys and groups from the iterable. If the key function is not specified or is None, the element itself is used for grouping.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, this summary line is not very informative.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I definitely agree that "returns consecutive keys and groups" isn't informative without more context.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion (sans example):

Return an iterator of groups of values as (key, group) pairs.

This groups together consecutive values from the given iterator. For each group, a (key, group) 2-tuple is yielded, where ``key'' is the common key value for the group, and ``group'' is an iterator of the values in the group.

Copy link
Contributor Author

@taleinat taleinat Oct 29, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Revised suggestion:

Return an iterator of groups of items as (key, group) pairs.

This groups together consecutive items from the given iterator. Each group contains items for which the key gives the same result.

For each group, a (key, group) 2-tuple is yielded, where "key" is the common key value for the group and "group" is an iterator of the items in the group.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks better to me. But I'm not an English expert, and in any case the last word on to @rhettinger.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've changed the doc-string to match my latest suggestion. Will await review by @rhettinger.

/*[clinic input]
itertools.groupby.__reduce__ as groupby_reduce

lz: self(type="groupbyobject *")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that it is better to rename the parameter than use the self converter. This increases the size of the patch, but in long term it makes the code clearer.

The same for other methods below.

But ask @rhettinger first. His can have different preferences.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll wait for @rhettinger's input on this.

@@ -137,8 +185,17 @@ groupby_next(groupbyobject *gbo)
return r;
}

/*[clinic input]
itertools.groupby.__reduce__ as groupby_reduce
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is groupby_reduce used anywhere in non-generated code? If no, it is better to keep the name generated by Argument Clinic.

The same for other methods below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think we should also remove the as for the new functions, e.g. itertools.groupby.new as groupby_new?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

groupby_new and groupby_new__doc__ are used in the initializer of groupby_type.

@classmethod
itertools.starmap.__new__ as starmap_new

function: object
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Parameters should be positional-only.

@classmethod
itertools.count.__new__ as count_new

start: object = NULL
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

    start as long_cnt: object(c_default="NULL") = 0
    step as long_step: object(c_default="NULL") = 1

* drop all "as" clauses for function names, renaming the relevant C
  functions instead
* count.__new__: add missing default values for arguments and rename
  the argument variables back to those in the pre-AC code
* starmap.__new__: properly define the arguments as positional-only
all have the same key. The common key for the group is the first item in the
pair.

Example:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before adding an example, I would try adding a clear summary of the class. Why did you remove the original summary, “an iterator that returns consecutive keys and groups”?

reduce_doc},
{NULL, NULL} /* sentinel */
};

PyDoc_STRVAR(starmap_doc,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems no longer used

Create a starmap object.

Return an iterator whose values are returned from the function evaluated
with a argument tuple taken from the given iterable.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

with argument tuples


Create a chain object.

Alternate chain() contructor taking a single iterable argument
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spelling: constructor

(Also, alternative is more correct than alternate outside the US.)


ro: self(type="repeatobject *")

Private method returning an estimate of len(list(it)).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aren’t ro and it the same parameter?

Copy link
Contributor Author

@taleinat taleinat Oct 29, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but ro isn't included in the doc-string and would be unclear to users IMO. Suggestions for something more informative than it are welcome! (self?)

Currently this gives:

>>> help(itertools.repeat.__length_hint__)
Help on method_descriptor:

__length_hint__(self, /)
    Private method returning an estimate of len(list(it)).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use self?

@@ -4541,6 +4045,990 @@ are exhausted, the fillvalue is substituted in their place. The fillvalue\n\
defaults to None or can be specified by a keyword argument.\n\
");


/* including clinic output must come after typdefs and PyObject type
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typedefs?

@rhettinger rhettinger self-assigned this Oct 29, 2017
@rhettinger
Copy link
Contributor

rhettinger commented Oct 29, 2017

Wow, this is a stunningly massive diff (+2,890 −1,274) to a file that was only 4,684 lines to begin with. I didn't expect so much code to be moved into a .h file (which seems unwise to me and will complicate my ability to understand the code). Also, I only expected the constructor calls to be given to AC, not all of the internal reduce/setstate methods and whatnot.

That said, thank you for the effort. I start working the diffs and it may take a while.

@taleinat
Copy link
Contributor Author

taleinat commented Jun 12, 2018

I've created another PR, #4170, with a small subset of these changes (just itertools.groupby). I suggest starting with getting that sorted out, and then deciding if and how to continue with the rest of the module.

Copy link
Contributor

@rhettinger rhettinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've reviewed, fixed, and applied pr #4170. Please use that as a model for improving this patch.

In particular, please don't move the PyTypeObject struct definitions to the bottom. This will greatly minimize the code churn and will keep the type structs right by beside the related code (which helps maintainability).

Also, just convert the primary documented interfaces. No need for setstate, reduce, etc. Please respect the existing docstrings as much as possible. The goal is to make signatures visible, not to rewrite the documentation. That said, if you find something that is unclear or incorrect, please point it out and I will take a look.

@bedevere-bot
Copy link

When you're done making the requested changes, leave the comment: I have made the requested changes; please review again.

@taleinat
Copy link
Contributor Author

@rhettinger, will do.

If you prefer, I will gladly continue to split this out into small PRs rather than one large one. Just let me know your preference.

@rhettinger
Copy link
Contributor

rhettinger commented Sep 10, 2018

One big patch will be fine. Following the pattern of PR #4170 will give a much more compact and reviewable single patch.

@taleinat
Copy link
Contributor Author

@rhettinger

please don't move the PyTypeObject struct definitions to the bottom

Some things will have to be moved to the top/bottom due to needing to appear before/after the line #include "clinic/itertoolsmodule.c.h". This includes, for example, the PyTypeObject declarations, which must come before that line.

If this is a deal-breaker, please let me know ASAP.

@serhiy-storchaka
Copy link
Member

Can forward declarations help?

@taleinat
Copy link
Contributor Author

Can forward declarations help?

I'm working on this now and so far it seems that only the static PyTypeObject declarations need to appear at the top. If @rhettinger prefers, we can duplicate them, keeping them in their current positions as well as at the top of the file before the #include.

@taleinat
Copy link
Contributor Author

See new PR #9164 which is a complete reworking of this as per @rhettinger's directions.

@rhettinger
Copy link
Contributor

This PR was superseded by PR 9164. Marking as closed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants