
urllib3 connection pool full using messaging.send_each_for_multicast() #712


filippotp opened this issue Aug 9, 2023 · 26 comments

@filippotp

  • Operating System version: Ubuntu 22.04 (Heroku)
  • Firebase SDK version: 6.2.0
  • Firebase Product: FCM
  • Python version: 3.11.4
  • Pip version: 23.2.1

I'm using FCM to send multicast messages from my Django app running on Heroku.

Since the messaging.send_multicast() function is deprecated, I changed my code to use messaging.send_each_for_multicast(), as stated in the deprecation warning.

After pushing the new code to production, I often see multiple warnings in the Heroku application logs when the app sends messages: WARNING:urllib3.connectionpool:Connection pool is full, discarding connection: fcm.googleapis.com. Connection pool size: 10

Reverting my code to messaging.send_multicast() seems to resolve the issue.

@google-oss-bot

I couldn't figure out how to label this issue, so I've labeled it for a human to triage. Hang tight.

@Doris-Ge
Contributor

Hi @filippotp,

Thanks for reporting the issue!

According to the answers in https://stackoverflow.com/questions/53765366/urllib3-connectionpool-connection-pool-is-full-discarding-connection, seeing WARNING:urllib3.connectionpool:Connection pool is full, discarding connection: fcm.googleapis.com. Connection pool size: 10 does not mean that there is any problem with messaging.send_each_for_multicast(). As long as messaging.send_each_for_multicast() returns a BatchResponse without any failures in it, then the multicast_message should be sent to all the tokens successfully.

Here's the reason why those warnings only show up in the logs when messaging.send_each_for_multicast() is used:

Unlike messaging.send_multicast, the underlying implementation of messaging.send_each_for_multicast() uses concurrent.futures.ThreadPoolExecutor to start multiple threads to send to tokens via urllib3 in parallel. We set the maximum number of threads that ThreadPoolExecutor can start, that is max_workers, to the number of the tokens in the multicast_message. That means if a multicast_message contains 50 tokens, then ThreadPoolExecutor may start at most 50 threads. However, the maxsize for urllib3.connectionpool is not configured so it is the default value, 10. Then https://stackoverflow.com/a/66671026 helps explain everything:

For example, if your maxsize is 10 (the default when using urllib3 via requests), and you launch 50 requests in parallel, those 50 connections will be performed at once, and after completion only 10 will remain in the pool while 40 will be discarded (and issue that warning).

So if you only send a multicast_message with no more than 10 tokens, I think the warning logs should be gone. However, the warning logs don't really matter: messaging.send_each_for_multicast() should still work properly regardless of the warnings. We may optimize the implementation of messaging.send_each_for_multicast() to start fewer threads in the future, which may get rid of those warnings, but until that happens, please continue migrating to messaging.send_each_for_multicast().
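To illustrate the workaround of keeping each multicast at or below the default urllib3 pool size, here's a minimal sketch. The `chunked` helper is hypothetical (not part of the SDK), and the commented usage assumes firebase-admin is initialized elsewhere:

```python
from itertools import islice

# Default urllib3 connection pool size; staying at or below this
# avoids the "Connection pool is full" warning discussed above.
URLLIB3_POOL_MAXSIZE = 10

def chunked(tokens, size=URLLIB3_POOL_MAXSIZE):
    """Yield successive lists of at most `size` tokens."""
    it = iter(tokens)
    while chunk := list(islice(it, size)):
        yield chunk

# Usage sketch (firebase-admin assumed initialized elsewhere):
# for batch in chunked(all_tokens):
#     message = messaging.MulticastMessage(tokens=batch, data=payload)
#     response = messaging.send_each_for_multicast(message)
```

This trades a few extra round trips for connection reuse, so it is only a stopgap until the SDK caps its own thread count.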

@mortenthansen

Any updates on letting messaging.send_each and messaging.send_each_for_multicast start fewer threads? It's not obvious to me why starting a thread per token in the multicast message is a good idea. The number of tokens one wants to send the message to most likely has nothing to do with the desired concurrency (which depends on system resources, etc.).

Note that calling messaging.send_each_for_multicast multiple times (which is what I do now) is a sub-optimal solution.

Maybe the SDK can be extended to allow us to configure the max_workers to use when calling messaging.send_each and messaging.send_each_for_multicast?
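Until something like that lands, a concurrency cap can be imposed from the caller's side. This is a sketch of the idea, not SDK API: `send_one` is a placeholder for whatever per-message sender you use (e.g. messaging.send), and `MAX_WORKERS` is chosen by the caller:

```python
from concurrent.futures import ThreadPoolExecutor

# Caller-chosen concurrency cap; matching urllib3's default pool
# size (10) keeps connections reusable instead of being discarded.
MAX_WORKERS = 10

def send_all(messages, send_one, max_workers=MAX_WORKERS):
    """Send each message via `send_one`, running at most
    `max_workers` sends concurrently.

    `send_one` stands in for the real per-message sender,
    e.g. `messaging.send` from firebase-admin.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order in its results
        return list(pool.map(send_one, messages))
```

The point is that concurrency becomes a fixed resource decision rather than a function of the input size.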

@Klemenceo

+1 on @mortenthansen's comment; we got bitten by this, sending pushes in batches of 500 messages, only to realise len(input) was used as the pool size for the ThreadPoolExecutor. I really can't figure out a good rationale for this. Spawning 500 threads (which I don't think will actually happen, since the Python thread pool reuses free threads before spawning new ones) isn't cost-free and at the very least should be documented as such.

@milemik

milemik commented Jul 26, 2024

> [quotes @Doris-Ge's explanation above]

@filippotp does this mean that send_each_for_multicast with 500 messages will send all push notifications regardless of the warnings (i.e., it will not discard 490 messages out of 500 and send only 10 per batch)?

Thank you for your answer in advance :)

@filippotp
Author

@milemik I think you are referring to @Klemenceo's comment.

Anyway, in my personal case the issue has never occurred again after updating firebase-admin to 6.3.0 and definitively migrating to messaging.send_each_for_multicast().

@milemik

milemik commented Jul 26, 2024

> [quotes @filippotp's reply above]

Yes, sorry, the question was for @Doris-Ge :)
I have version 6.5.0 and I can see these warnings in the logs of my Django (Celery) project. I just want to understand whether all of my push notifications will be sent regardless of this warning.

Thank you 😄

@ankit-wadhwani-hb

I have version 6.5.0 and I can see these warnings in the logs of my Django (Celery) project. Also, because 500 threads are spawned, the worker consumes 100% of the machine's CPU, which affects other workers.

What can be the solution for this? @milemik @filippotp

@milemik

milemik commented Aug 27, 2024

> [quotes @ankit-wadhwani-hb's comment above]

Hi @ankit-wadhwani-hb, to be honest I didn't benchmark this, but it does make sense that the threads would use 100% of the CPU; maybe this is expected. What matters is that CPU usage returns to normal after the push notifications are sent, and hopefully no notifications are lost. I will get back to you when I test it. Thank you for the notice!

@Jay2109

Jay2109 commented Sep 16, 2024

> [quotes @ankit-wadhwani-hb's comment above]

The same thing is happening to me: CPU utilisation goes to 99-100%, and as soon as I restart the process the CPU drops significantly.

@Jay2109

Jay2109 commented Sep 17, 2024

> [quotes the exchange above]

Quick update: if I change the batch size from 500 to 10 it works properly, but I'm not sure that is the right way to do it.

@milemik

milemik commented Sep 17, 2024

> [quotes the exchange above]

@Jay2109 Well, I don't think this is a good fix... It means we need more resources when sending push notifications (more tasks will be triggered, at least in my logic 😄).

@Jay2109

Jay2109 commented Sep 17, 2024

> [quotes the exchange above]

What solution have you implemented currently?
In my case I am sending many notifications, and they are not reaching Android devices on time.

@ankit-wadhwani-hb

> [quotes the exchange above]

@Jay2109 @milemik - any other solution? Currently I am batching 10 messages at a time; otherwise it consumes the full server CPU + memory and hampers other services.

@Jay2109

Jay2109 commented Sep 18, 2024

The batch of 10 also doesn't work for long; after some time the CPU increases again, and then I restart the process.
This is what is happening for me.

@milemik

milemik commented Sep 18, 2024

> [quotes the exchange above]

Well @ankit-wadhwani-hb, to be honest I'm not sure... I hope the developers of this library are aware of this issue, and I expect to hear an answer from them.
How many push notifications do you send? It really depends on where you are hosting and how you are creating tasks...

@ankit-wadhwani-hb

@milemik - Currently there are about 20 scheduled bulk notifications daily, mostly going to 100,000 users. I am using Celery with Python to push the notifications into an SQS queue, and a supervisor worker consumes messages from the queue and sends the notifications.

@milemik

milemik commented Sep 18, 2024

> [quotes @ankit-wadhwani-hb's comment above]

Ok, and how many push notifications do you send in one task? If you are sending them all from a single task, maybe you can optimize by splitting the 100,000 sends across several smaller Celery tasks instead of one.
P.S. It's really hard to give you a recommendation without a code snippet 😄

Anyway, let's wait for an answer from someone on the development team 😃

@Jay2109

Jay2109 commented Sep 18, 2024

I am not able to send 500 in a batch; after some time it eats up my CPU.

@DanKamilov

Any news? The problem persists with send_each... send_multicast worked well!!

@Jay2109

Jay2109 commented Dec 6, 2024

Is there any update? If this is fixed in a beta or in another language's SDK, please update here.

@milemik

milemik commented Dec 6, 2024

I think this is more of a urllib3 constraint... and there are ways to raise that limit... I would like to hear an answer from the maintainers... 😅

@joaqo

joaqo commented Feb 13, 2025

2025 and the problem persists...

@davidemerritt

+1 - I think it would be a very small change to allow the caller to set max_workers to something that makes sense for the resources it is running on. Any update @Doris-Ge?

@horrhamid

Hi, I had this issue too.

To work around it, I initially handled the requests manually. I built an async solution using httpx.AsyncClient and sent Firebase notification requests in batches. This significantly reduced CPU usage and improved the sending speed.

However, there's now a better solution available directly in the Firebase Admin SDK. A new async feature has been merged (but not officially released yet) that adds a send_each_async method. You can use it by installing the SDK from the specific commit like this:

```
firebase-admin @ git+https://github.com/firebase/firebase-admin-python.git@a4bc029#egg=firebase-admin
```

Then, you can send messages asynchronously like this:

```python
messages = [
    messaging.Message(
        data=message.data,
        notification=message.notification,
        android=message.android,
        webpush=message.webpush,
        apns=message.apns,
        fcm_options=message.fcm_options,
        token=token,
    )
    for token in message.tokens
]
response = await messaging.send_each_async(messages=messages, dry_run=False)
```

But remember to send 500 or fewer messages in each batch.
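The bounded-concurrency part of that workaround can be sketched with a plain asyncio semaphore, independent of any HTTP client. This is an illustration, not SDK API: `send_async` is a stand-in for the real async sender (e.g. an httpx call or send_each_async), and `limit` is the caller-chosen cap:

```python
import asyncio

async def send_bounded(messages, send_async, limit=10):
    """Run `send_async` over all messages with at most `limit`
    requests in flight at once.

    `send_async` is a placeholder for the real async sender.
    """
    sem = asyncio.Semaphore(limit)

    async def one(msg):
        async with sem:  # cap concurrent requests
            return await send_async(msg)

    # gather preserves the order of its input awaitables
    return await asyncio.gather(*(one(m) for m in messages))
```

Compared to a thread pool, this keeps concurrency control explicit while avoiding per-request threads entirely.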

@lahirumaramba
Member

@horrhamid you are right! We are actively working on optimizing the current send_each APIs. We will track progress in #788.
Stay tuned!
