Skip to content

Fix "I'm feeling lucky" search to include all crates #1251

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jan 21, 2021

Conversation

syphar
Copy link
Member

@syphar syphar commented Jan 17, 2021

This fixes the I'm feeling lucky search ( #789 ) to do an efficient search over all creates, not only the first 280 ones.

I was searching and testing through many different ways to get a random record, and found this to be the best for our dataset. (see also this stackoverflow answer for some overview).

I tested this locally with a dummy-dataset with 600 crates >100 stars, and out of 10k of these queries there were only <50 which needed a second try. In the statement execution time was <10ms all the time.

Other approaches felt too complex in code, were too slow, or were too slow randomly.

I'm happy to implement any needed changes.

In #789 @Kixiron was mentioning metrics to be added, while I'm not sure which metrics are needed.

-- this query too often.
-- it depends on the percentage of crates with > 100 stars
SELECT DISTINCT 1 + trunc(random() * params.max_id)::INTEGER AS id
FROM params, generate_series(1, 500)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hard-coding 500 makes me a little nervous. Could we generate this live based on production info? Or does that slow down the query too much?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

generating that value live slows down the query too much, I valued speed higher.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the number depends on the ratio of crates >100 stars

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while thinking about this, I have another idea (but not likely it's faster)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me know if it works out :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

didn't help 😢 was too slow :)

@syphar
Copy link
Member Author

syphar commented Jan 18, 2021

@jyn514 thanks for the review! I'll check the comments in detail in the evening today

@Kixiron
Copy link
Member

Kixiron commented Jan 18, 2021

The metrics were mostly because I was curious, I wanted to put a counter on the I'm feeling lucky button that reported how many times it was actually used

Comment on lines 501 to 503
// since there is a chance of this query returning an empty result,
// we retry a certain amount of times, and log a warning.
for _ in 0..20 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems strange to use a fallible query, is there not a way to make it always work?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Kixiron I tried (I think) 5-10 different approaches, the ones that always work are either too slow or involve things like a recursive CTE (which is in in the end similar to retrying here, while this is more readable IMHO).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just ran some numbers: current query runs between 2 and 8 ms (~)
and delivers a random crate directly in most cases (50 misses for 10k queries),

the repetitions are more or less a safeguard.

Also I could not test on a production dataset so I don't know the amount of gaps in the id-span on the crate-table (and the production github stars without backfilling everything locally)

@syphar
Copy link
Member Author

syphar commented Jan 18, 2021

The metrics were mostly because I was curious, I wanted to put a counter on the I'm feeling lucky button that reported how many times it was actually used

@Kixiron wouldn't we get that from the access logs too? or don't we have these?

I'll check other examples and see if it's easy to add

@syphar
Copy link
Member Author

syphar commented Jan 18, 2021

@Kixiron @jyn514 I (hopefully) answered your questions and added two commits with the prosed log-statement changes and metrics.
I admit I was definitely focusing on query speed, while sacrificing beauty or (probably a little) readability.

When wanted I could likely produce something better readable, but slower ( ~100 or even 200 ms ) ,

or we could also just add the metrics to see if the feature is actually used or could be removed :)

Copy link
Member

@jyn514 jyn514 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wouldn't we get that from the access logs too? or don't we have these?

The access logs are not integrated with the grafana logging, and also don't work when running locally. Plus I prefer to have as little depending on nginx as possible because it's not tested at all and doesn't go through review.

or we could also just add the metrics to see if the feature is actually used or could be removed :)

I would be interested in that.

When wanted I could likely produce something better readable, but slower ( ~100 or even 200 ms ) ,

I would prefer to have the faster query I think, the searches take long enough already. Thanks for offering though :)

Also I could not test on a production dataset so I don't know the amount of gaps in the id-span on the crate-table (and the production github stars without backfilling everything locally)

Let me know how to test this, happy to runs some queries. I did run the 'redirect_to_random_query' once with EXPLAIN ANALYZE and it took less than a millisecond, great job :)

-- this query too often.
-- it depends on the percentage of crates with > 100 stars
SELECT DISTINCT 1 + trunc(random() * params.max_id)::INTEGER AS id
FROM params, generate_series(1, 500)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me know if it works out :)

@syphar
Copy link
Member Author

syphar commented Jan 19, 2021

or we could also just add the metrics to see if the feature is actually used or could be removed :)

I would be interested in that.

To validate what I understand: I would create a separate PR only adding the metrics. Then we watch how often this is actually used and we decide to drop or fix? @jyn514

Also I could not test on a production dataset so I don't know the amount of gaps in the id-span on the crate-table (and the production github stars without backfilling everything locally)

Let me know how to test this, happy to runs some queries. I did run the 'redirect_to_random_query' once with EXPLAIN ANALYZE and it took less than a millisecond, great job :)

I just wrote a small (ugly) tool: https://gist.github.com/syphar/949cdb2ee297ca5695bad179cfd99a6e

With that you should be able to play around with the numbers :)

@jyn514
Copy link
Member

jyn514 commented Jan 19, 2021

To validate what I understand: I would create a separate PR only adding the metrics. Then we watch how often this is actually used and we decide to drop or fix? @jyn514

I think this is a good change as-is, I would be happy merging now and we can add the metrics PR later.

With that you should be able to play around with the numbers :)

Thanks! I tested this in prod and it never came up missing:

records: 10000
with result: 10000
without result: 0

So I don't think we have to worry.

Is this ready to merge? It looks good from my end :)

@syphar
Copy link
Member Author

syphar commented Jan 20, 2021

To validate what I understand: I would create a separate PR only adding the metrics. Then we watch how often this is actually used and we decide to drop or fix? @jyn514

I think this is a good change as-is, I would be happy merging now and we can add the metrics PR later.

After the comment by @Kixiron I already added metrics, as I understood how it would have to be done (see e9bfc47 ) .

Is this ready to merge? It looks good from my end :)

From my side It would be ready, except we want more tests, different metrics, or any other change

@syphar
Copy link
Member Author

syphar commented Jan 20, 2021

@jyn514 just discovered by accident that my testing-gist used an even smaller number or random IDs (100) and still always delivered results.

But 500 is fast enough, so I guess we're fine with the current number for some time.

@jyn514 jyn514 merged commit e9f300a into rust-lang:master Jan 21, 2021
@syphar syphar deleted the fix-im-feeling-lucky branch January 21, 2021 19:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants