This repository has been archived by the owner on Mar 5, 2024. It is now read-only.

Note drawback of current scheduling strategy in pmap_to_bag
xandkar committed May 24, 2022
1 parent e913db9 commit 97fe3c3
Showing 1 changed file with 14 additions and 0 deletions.
14 changes: 14 additions & 0 deletions src/data/data_stream.erl
@@ -84,6 +84,20 @@ pmap_to_bag(T, F, J) when is_function(T), is_function(F), is_integer(J), J > 0 -
end,
Producer =
fun () ->
%% XXX Producer is racing against consumers.
%%
%% This hasn't (yet) caused a problem, but in theory it is
%% bad: the producer pours into the scheduler's queue as
%% fast as possible, potentially faster than consumers can
%% pull from it, so heap usage could grow without bound.
%%
%% Solution ideas:
%% A. have the scheduler call the producer whenever more
%% work is asked for, but that can block the
%% scheduler, starving consumers;
%% B. produce in batches of configurable size, pausing
%% production when a batch is full and resuming once it
%% has drained (this is probably the way to go).
ok = iter(fun (X) -> SchedPid ! {SchedID, producer_output, X} end, T)
end,
Ys =
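
Idea B from the comment in this hunk could look roughly like the sketch below. It is a minimal, self-contained illustration and not code from data_stream.erl: the module batch_sketch, the functions run/2, produce/5 and sink/4, and the item, sync, ack and eof messages are all hypothetical. It also assumes a plain list as the source instead of the stream T, and a single sink process standing in for the scheduler. The producer sends at most BatchSize items, then sends a sync marker and blocks until the sink acknowledges it. Because messages between two Erlang processes arrive in order, the sink only sees the marker after it has taken the whole batch out of its mailbox, so the producer never has more than one batch plus the marker waiting there.

```erlang
-module(batch_sketch).
-export([run/2]).

%% Hypothetical sketch; none of these names or messages exist in
%% data_stream.erl. Returns {Received, Acks}: the items in the order
%% the sink received them, and how many batches were acknowledged.
run(Items, BatchSize)
  when is_list(Items), is_integer(BatchSize), BatchSize > 0 ->
    Parent = self(),
    Tag = make_ref(),
    Sink = spawn_link(fun () -> sink(Tag, Parent, [], 0) end),
    _ = spawn_link(fun () -> produce(Tag, Sink, Items, BatchSize, 0) end),
    receive
        {Tag, done, Received, Acks} -> {Received, Acks}
    end.

%% Producer: nothing left to send, so tell the sink we are done.
produce(Tag, Sink, [], _BatchSize, _Sent) ->
    Sink ! {Tag, eof},
    ok;
%% Batch is full (Sent =:= BatchSize): pause until the sink has
%% taken everything sent so far out of its mailbox.
produce(Tag, Sink, Items, BatchSize, BatchSize) ->
    Sink ! {Tag, sync, self()},
    receive {Tag, ack} -> ok end,
    produce(Tag, Sink, Items, BatchSize, 0);
%% Batch has room: send one more item.
produce(Tag, Sink, [X | Xs], BatchSize, Sent) ->
    Sink ! {Tag, item, X},
    produce(Tag, Sink, Xs, BatchSize, Sent + 1).

%% Sink: collects items and acknowledges each sync marker.
sink(Tag, Parent, Acc, Acks) ->
    receive
        {Tag, item, X} ->
            sink(Tag, Parent, [X | Acc], Acks);
        {Tag, sync, Producer} ->
            Producer ! {Tag, ack},
            sink(Tag, Parent, Acc, Acks + 1);
        {Tag, eof} ->
            Parent ! {Tag, done, lists:reverse(Acc), Acks}
    end.
```

In this sketch the sink acknowledges as soon as it reads the marker. A real scheduler that keeps its own internal queue would presumably delay the ack until that queue has room again; otherwise the backlog just moves from the mailbox into the scheduler's state.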