I'm learning Celery and trying to figure out the best practice here.
I want to fan out a bunch of small jobs from inside another job. I understand that a group is a great way to collect the results from a bunch of enqueued smaller jobs, but what if I don't care about the results? I just want to split a big job into a bunch of smaller tasks. Is calling delay in a loop the same thing in this case?
Is it still best practice to use a group when fanning out and queueing up a bunch of smaller jobs from a parent job? Below is some pseudocode showing both options.
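# Option 1: fan out with a group
from celery import group

@app.task
def call_job(job_args):
    # Build one signature per argument and enqueue them as a single group.
    group(small_job.s(x) for x in job_args).apply_async()

@app.task
def small_job(x):
    pass  # do something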
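# Option 2: call delay in a loop
@app.task
def call_job(job_args):
    for x in job_args:
        # Enqueue each small job independently; no group result is tracked.
        small_job.delay(x)

@app.task
def small_job(x):
    pass  # do something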
Use group when you want to know that all of the smaller tasks have finished (for example, because you want to see their results together).
Use group when another action or task should happen only after all of the tasks have finished (a chord).
Behind the scenes, something (the Celery result backend) has to keep track of the finished tasks so that it can tell you whether the group result is ready().
The bottom line: if you don't care about your tasks' return values, and no follow-up action depends on all of them having finished, go with the for loop.