hong_devlog

Chapter 5. Basic Structured Operations

Notes on Spark: The Definitive Guide. Schema: defines the column names and data types of a DataFrame. A schema can be obtained from the data source or defined explicitly. It is an object made up of multiple typed fields. If a data type at runtime does not match the data type in the schema, Spark…

Chapter 4. Structured API Overview

Notes on Spark: The Definitive Guide. Overview: the Apache Spark community introduced the structured APIs with the 2.0 release. In 1.x, working with lower-level APIs such as RDDs was the dominant approach, but since the 2.0 release, automated optimization and fault-handling capabilities are provided…

Chapter 3. A Tour of Spark's Toolset

Notes on Spark: The Definitive Guide. Production Application: Spark makes it easy to develop big data programs. spark-submit: a program developed in the interactive shell can easily be turned into a production application. The application code to the cluster…

Chapter 2. A Gentle Introduction to Spark

Notes on Spark: The Definitive Guide. Cluster: pools the resources of several computers so that they can be used as if they were a single computer. A cluster needs a framework that can coordinate work across it, and Spark is such a framework. Spark Application: to keep track of the available resources, Spark … cluster…

Chapter 1. What is Apache Spark

Notes on Spark: The Definitive Guide. Apache Spark: a unified computing engine and a set of libraries for big data. An open-source engine that processes data in parallel in a cluster environment. Supports Python, Java, Scala, and R. Features: designed so that a wide range of data analysis tasks can be performed through a consistent API (cons…

Fragment Transaction

Transaction: at runtime, a FragmentManager can add, remove, replace, or perform other actions with fragments in response to user interaction. Each set of such fragment changes is called a transaction, and the API provided by the FragmentTransaction class is used to specify what to do inside the transaction. Multiple acti… in a single transaction…

Wildcards in generics

Wildcard: ? The question mark (?) is used as a wildcard in generic programming. It represents an unknown type. It can be used in many contexts, such as the type of a parameter, field, local variable, or return value. Types of wildcards: Upper Bounded Wildcards Collectiontype<…
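As a rough illustration of the wildcard described in the excerpt above, here is a minimal Java sketch. The class and method names (`WildcardDemo`, `sum`, `count`) are made up for this example and are not taken from the post itself.

```java
import java.util.List;

class WildcardDemo {
    // Upper bounded wildcard: accepts a List of Number or of any subtype of Number.
    static double sum(List<? extends Number> values) {
        double total = 0.0;
        for (Number n : values) {
            total += n.doubleValue();
        }
        return total;
    }

    // Unbounded wildcard: the element type is unknown, so only
    // type-independent operations such as size() are used here.
    static int count(List<?> items) {
        return items.size();
    }

    public static void main(String[] args) {
        System.out.println(sum(List.of(1, 2, 3)));    // List<Integer> is accepted
        System.out.println(sum(List.of(1.5, 2.5)));   // List<Double> is accepted too
        System.out.println(count(List.of("a", "b"))); // any element type works
    }
}
```

A `List<Integer>` is not a `List<Number>`, so without the wildcard the first call to `sum` would not compile; the bound `? extends Number` is what makes both calls legal.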

Generics in Java

Generics: parameterized types. They allow types such as Integer and String to be used as parameters to methods, classes, and interfaces. With generics, you can create classes that work with different data types. Advantages: Code Reuse: method, class, …
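The idea in the excerpt above, one class that works with different data types, could be sketched roughly as follows. `Box` and `GenericsDemo` are hypothetical names chosen for this illustration only.

```java
class GenericsDemo {
    public static void main(String[] args) {
        // The same Box class is reused for String and Integer values.
        Box<String> text = new Box<>("hello");
        Box<Integer> number = Box.of(42);

        String s = text.get();  // no cast needed, checked at compile time
        int n = number.get();   // unboxed from Integer

        System.out.println(s + " " + n);
    }
}

// A generic class: T is a type parameter supplied by the caller.
class Box<T> {
    private final T value;

    Box(T value) {
        this.value = value;
    }

    T get() {
        return value;
    }

    // A generic method with its own type parameter U.
    static <U> Box<U> of(U value) {
        return new Box<>(value);
    }
}
```

Because the element type is part of the declared type, assigning `text.get()` to an `int` would be rejected by the compiler rather than failing at runtime.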

JNI (Java Native Interface)

JNI: a bridge between bytecode running on the JVM and native code. An application can be written entirely in Java, but there are cases where Java alone cannot meet the application's requirements. When an application cannot be written in Java alone, JNI is used to write Java native methods…

Inner Class, Anonymous Class

Inner Class: a class declared inside another class. An inner class can use all methods and variables of the outer class, including those declared private. Because different inner classes can implement the same interface, they are useful when methods need to be overridden. E…
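A small Java sketch of the two points in the excerpt above: an inner class reading a private member of its outer class, and two inner classes implementing the same interface. The names `Counter`, `Step`, `Increment`, and `Reset` are assumptions made for this example.

```java
class InnerClassDemo {
    public static void main(String[] args) {
        Counter counter = new Counter();

        // An inner class instance is always tied to an outer instance.
        Counter.Step increment = counter.new Increment();
        increment.apply();
        increment.apply();
        System.out.println(counter.value());

        Counter.Step reset = counter.new Reset();
        reset.apply();
        System.out.println(counter.value());
    }
}

class Counter {
    // Private, yet directly accessible from the inner classes below.
    private int count = 0;

    interface Step {
        void apply();
    }

    // Two inner classes implement the same interface with different overrides.
    class Increment implements Step {
        @Override
        public void apply() {
            count++;
        }
    }

    class Reset implements Step {
        @Override
        public void apply() {
            count = 0;
        }
    }

    int value() {
        return count;
    }
}
```

Each `Step` object operates on the `Counter` instance that created it, which is why the `counter.new Increment()` syntax is needed outside the outer class.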