Aleksey Slyusarenko

Grammarly, Ukraine

Research Engineer with 4 years of experience in machine learning and NLP and 10 years in computational algorithm development. Winner of programming and math competitions (Google Code Jam: 11th place overall; IMC: 1st prize).

Speaker's activity
Petabyte-Scale Text Processing with Spark
May 20th
14:30-15:15
Talk
Russian

At Grammarly, we have long used Amazon EMR with Hadoop and Pig for our big data processing. Excited by the improvements that a maturing Apache Spark offers over Hadoop and Pig, we set about getting Spark to work with our petabyte-scale text data set. This talk describes the challenges we ran into along the way and the scalable, working Spark setup we arrived at as a result.
