Load is distributed unequally between nodes #39
Regarding the mapper, your observation is right: one mapper works for one source reader on one node. We need to have a discussion about this problem. Btw, can I ask what kind of job you are putting on the mapper? Some nodes not executing anything looks like a bug in the hash ring, but could you turn on TRACE log level on those nodes to double-check? If anything is running on a node, you can easily tell by tailing its log file. In the meantime, I am going to run tests against our current hash ring.
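To make the hash-ring discussion concrete, here is a minimal consistent-hash ring with virtual nodes in Python. This is an illustrative sketch only (Mupd8 itself is Scala, and its actual ring implementation may differ); the node names, replica count, and key format are all made up. A healthy ring of this shape should give every node a share of the keys, so a node receiving nothing would indeed point to a ring bug:

```python
import hashlib
from bisect import bisect
from collections import Counter

class HashRing:
    """Minimal consistent-hash ring with virtual nodes (an illustrative
    sketch, not Mupd8's actual implementation)."""

    def __init__(self, nodes, replicas=100):
        # Each node owns `replicas` points on the ring, which smooths
        # out the distribution of keys across nodes.
        self.ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(replicas)
        )
        self.points = [point for point, _ in self.ring]

    @staticmethod
    def _hash(s):
        return int(hashlib.md5(s.encode()).hexdigest(), 16)

    def node_for(self, key):
        # First ring point at or after the key's hash, wrapping around.
        idx = bisect(self.points, self._hash(key)) % len(self.ring)
        return self.ring[idx][1]

# 25 nodes, as in the reported cluster; 10,000 random-ish keys.
ring = HashRing([f"node{i}" for i in range(25)])
counts = Counter(ring.node_for(f"key{k}") for k in range(10_000))
```

With enough virtual replicas, `counts` should contain all 25 nodes; any node absent from it would be the kind of symptom described above.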
If your app has mappers only, and no updaters, the work packets will never be sent across the network.
My guess is that your stream source is only accessible locally on one node (the node where all mappers are scheduled)? If you are using a distributed storage system as your source (e.g. HDFS or Amazon Simple Queue Service), you might be able to spread your source threads and their corresponding mappers across your cluster. If your stream only comes in on that one node, it would probably be worthwhile to add a darn-simple mapper that simply distributes your incoming events evenly to your other nodes, where your original CPU-intensive mappers would run. So it would go like this: stream source --> darn-simple mapper --> {CPU-intensive mappers}
It is not possible to write a simple mapper that will distribute events across nodes; as of now this can only be done via an Updater. Create hash buckets of randomly generated keys and send them to their corresponding Updaters on different hosts.
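The bucket-of-random-keys idea could be sketched like this in Python (illustrative only; the host list, bucket count, and the `route` stand-in for the framework's key-to-host mapping are all hypothetical, since in a real Mupd8 app the hash ring decides which host owns a key):

```python
import random
import zlib

# Hypothetical hosts and bucket count; in a real deployment the
# framework's hash ring, not this stand-in, maps a key to a host.
HOSTS = ["host-a", "host-b", "host-c", "host-d"]
NUM_BUCKETS = 64

random.seed(42)  # fixed seed so the sketch is reproducible
# One randomly generated key per bucket. Publishing an event under
# one of these keys routes it to whichever Updater owns that key.
bucket_keys = ["bucket-%08x" % random.getrandbits(32) for _ in range(NUM_BUCKETS)]

def route(event_id: int) -> str:
    """Pick a bucket key for the event; the key determines the target host."""
    key = bucket_keys[event_id % NUM_BUCKETS]
    return HOSTS[zlib.crc32(key.encode()) % len(HOSTS)]

targets = {route(e) for e in range(1000)}
```

Because the keys are random and there are many more buckets than hosts, the events end up spread over multiple hosts rather than all landing on one.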
I tried to follow zheguang's recommendation, but as ZohebV already said, it didn't work out. Both succeeding map tasks were executed on the same node. However, one thing did change: all the other nodes are now executing my update task.
I replaced my mapper with an updater doing the same job, so my topology now looks like this: Source -> Updater_1 (do calculation) -> Updater_2 (aggregate results). My problem remains that some nodes aren't executing anything, although most of them are executing Updater_1 and Updater_2.
@Teots Use more hash buckets and good hash functions/random number generators to ensure a more even distribution of keys |
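The effect of the bucket count on balance can be quantified with a small simulation (illustrative Python, with a made-up SHA-1-based key-to-host assignment; the function name and parameters are hypothetical). It routes events through bucket keys onto hosts and reports the busiest host's load relative to the ideal even share:

```python
import hashlib
from collections import Counter

def max_over_mean(num_buckets, num_hosts=4, num_events=100_000):
    """Route events through bucket keys onto hosts; return max host load
    divided by the ideal per-host load (1.0 = perfectly even)."""
    keys = [hashlib.sha1(f"bucket-{i}".encode()).hexdigest()
            for i in range(num_buckets)]
    # A good hash of each key decides which host's Updater owns it.
    host_of_key = {k: int(k, 16) % num_hosts for k in keys}
    load = Counter(host_of_key[keys[e % num_buckets]]
                   for e in range(num_events))
    return max(load.values()) * num_hosts / num_events

few = max_over_mean(8)      # few buckets: coarse-grained, can be skewed
many = max_over_mean(4096)  # many buckets: load stays close to 1.0
```

With only a handful of buckets, a single unlucky bucket-to-host assignment shifts a large fraction of the load; with thousands of buckets, the law of large numbers keeps every host near its fair share, which is the point of the advice above.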
I'm running a mupd8 application on 25 nodes for testing purposes at the moment, and I noticed that all map instances are scheduled onto only one of these 25 nodes. This results in very poor performance, because my map tasks are very CPU intensive. While most of the other nodes are executing update tasks, some of them aren't executing anything. I cannot find any reason for this, and no exceptions appear in my log files.
Further details about my application:
- the key of each tuple is selected randomly
- each node executes mupd8 with 4 worker threads
If you need more details, just let me know.