Re: Spark-optimized Shuffle (SOS) any update?


This is awesome news! Is there anything we can do to help? We are currently facing huge performance penalties due to this issue.

Thanks,
David

On Wed, Dec 19, 2018 at 5:43 PM Ilan Filonenko <if56@xxxxxxxxxxx> wrote:
The community has recently been actively working on this. The JIRA to follow is:
https://issues.apache.org/jira/browse/SPARK-25299. Several companies, including Bloomberg and Palantir, are working on a WIP solution that implements a variant of Option #5 (described in detail in the Google doc linked in the JIRA summary).

On Wed, Dec 19, 2018 at 5:20 AM <marek-simunek@xxxxxxxxx> wrote:
Hi everyone,
    we are facing the same problems Facebook had, where the shuffle service is a bottleneck. For now we have worked around it with a large task size (2g) to reduce shuffle I/O.

I saw a very nice presentation from Brian Cho on optimizing shuffle I/O at large scale [1]. It is an implementation of the white paper [2].
At the end of the talk, Brian Cho kindly mentioned plans to contribute it back to Spark [3]. I checked the mailing list and the Spark JIRA and didn't find any ticket on this topic.

Does anyone have a contact at Facebook who might know more about this? Or are there any plans to bring a similar optimization to Spark?

[1] https://databricks.com/session/sos-optimizing-shuffle-i-o
[2] https://haoyuzhang.org/publications/riffle-eurosys18.pdf
[3] https://image.slidesharecdn.com/5brianchoerginseyfe-180613004126/95/sos-optimizing-shuffle-io-with-brian-cho-and-ergin-seyfe-30-638.jpg?cb=1528850545