Skip to content

Various HashMap optimisations #337

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Jul 22, 2021
Merged

Various HashMap optimisations #337

merged 7 commits into from
Jul 22, 2021

Conversation

utdemir
Copy link
Contributor

@utdemir utdemir commented Jul 12, 2021

Just some minor optimisations for our HashMap implementation. It contains
a few unrelated small changes, so reviewing each individual commit might
be easier.

When I compare this against Data.HashMap.Strict, the results change based
on the input (likely based on the hash distribution of the key). I need to investigate
this more. Just as reference, here're the functions I am comparing (Key is just a
newtype wrapper around Int):

   linear :: [Key] -> HashMap.Linear.HashMap Key () %1-> ()
   linear inp hm = go inp hm `Linear.lseq` ()
    where
     go :: [Key] -> HashMap.Linear.HashMap Key () %1-> HashMap.Linear.HashMap Key ()
     go [] h = h
     go (x:xs) h = go xs Linear.$! HashMap.Linear.insert x () h

   dataHashMap :: [Key] -> Data.HashMap.Strict.HashMap Key () -> ()
   dataHashMap inp hm = go inp hm `seq` ()
    where
     go :: [Key] -> Data.HashMap.Strict.HashMap Key () -> Data.HashMap.Strict.HashMap Key ()
     go [] h = h
     go (x:xs) h = go xs $! Data.HashMap.Strict.insert x () h

I did not include the benchmark on this PR, since it looks a bit different
than our existing HashMap benchmarks. I have to combine them together at
one point.

utdemir added 5 commits July 8, 2021 13:44
It somehow ends up being faster, since otherwise we frequently allocate
boxed integers from the unpacked field.
It is accessed pretty often, so it is faster to have it ready.
Copy link
Member

@aspiwack aspiwack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess that the next step is committing these benchmarks.

@facundominguez
Copy link
Member

facundominguez commented Jul 12, 2021

Hello! Just a random thought, are alter and alterF supposed to be inlined? Otherwise, the allocation of the Maybe value returned by the callback can't be eliminated. insert in unordered-containers isn't implemented in terms of alter, at least.

@utdemir
Copy link
Contributor Author

utdemir commented Jul 12, 2021

are alter and alterF supposed to be inlined? Otherwise, the allocation of the Maybe value returned by the callback can't be eliminated.

This was a good point, I think they should be inlined. I can observe from the core output that both the callback passed to alterF from alter and eg. callback pased to alter from insert disappear. Unfortunately they do not correspond to a measurable speed increase on the insertion benchmarks, I think this is because they are only called once per insertion, but our bottleneck is on tryInsertAtIndex and probeFrom functions which are called multiple times per insertion.

But they make sense and make the core output prettier, so I commited those.

@utdemir utdemir force-pushed the ud/optimise-hashmaps branch from 75192a2 to f3fd127 Compare July 22, 2021 04:04
@utdemir utdemir merged commit ed59552 into master Jul 22, 2021
@utdemir utdemir deleted the ud/optimise-hashmaps branch July 22, 2021 04:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants