Monday, July 21, 2014
Spark good readings
1. Spark and Shark (Slideshare)
2. Apache Spark (Slideshare)
3. Lightening fast big data analytics using apache spark (Slideshare)
4. Spark Documentation
5. Putting Spark to Use: Fast In-Memory Computing for Your Big Data Applications
6. RDD
7. Persisting RDD in Spark
8. Scala on Spark function samples
9. How-to: Tune Your Apache Spark Jobs (Part 1)
10. How-to: Tune Your Apache Spark Jobs (Part 2)
11. Apache Spark Resource Management and YARN App Models