An open-source framework used for processing and storing large data sets in a distributed computing environment.