Readable Scala Code in Apache Spark (4 attempts)

Maciej Szymczyk
3 min read · Oct 31, 2020

Jupyter and Apache Zeppelin are good places to experiment with data. Unfortunately, notebooks by their nature do not encourage organizing code, in particular decomposing it and keeping it readable. We can copy the cells into IntelliJ IDEA and build a JAR, but the result will not be stunning. In this article you will learn how to write more readable Scala code for Apache Spark in IntelliJ IDEA.

0. The base code

It is a simple application which:

  • loads groceries data from a file;
  • keeps only the fruits;
  • normalizes names;
  • calculates the quantity of each fruit.
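
As a rough illustration of the four steps above, here is a minimal sketch of what such a single-block application could look like. It is not the original code of this article: the object name `GroceriesApp`, the default file name `groceries.csv`, the column names `name`, `type` and `quantity`, and the `"fruit"` label are all assumptions made for the example, and "normalizing" is assumed to mean trimming and lower-casing.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, lower, sum, trim}

// Hypothetical base application: everything lives in one main method,
// much like cells pasted straight from a notebook.
object GroceriesApp {
  def main(args: Array[String]): Unit = {
    // Assumption: the input path is the first argument, else groceries.csv
    val path = args.headOption.getOrElse("groceries.csv")

    val spark = SparkSession.builder()
      .appName("groceries")
      .master("local[*]") // assumption: local run, as in a notebook
      .getOrCreate()

    // Assumed layout: CSV with a header row and columns name, type, quantity
    val groceries = spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv(path)

    val fruitQuantities = groceries
      .filter(col("type") === "fruit")               // keep only the fruits
      .withColumn("name", lower(trim(col("name"))))  // normalize names
      .groupBy("name")
      .agg(sum("quantity").as("quantity"))           // quantity of each fruit

    fruitQuantities.show()
    spark.stop()
  }
}
```

Everything happens in one chained expression inside `main`, which is the kind of code the following attempts try to break up.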

1. Extract Methods

Let’s use the power of the IDE, more precisely the Extract Method refactoring. It lets you easily create a method from a selected piece of code. Let’s try to create a method for each step of the application this way.

It doesn’t work!?


Written by Maciej Szymczyk

Software Developer, Big Data Engineer, Blogger (https://wiadrodanych.pl), Amateur Cyclist & Triathlete, @maciej_szymczyk
