<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
<channel>
<title>@amuraru</title>
<link>https://amuraru.github.io/index.html</link>
<description>Recent content on @amuraru</description>
<generator>Hugo -- gohugo.io</generator>
<language>en-us</language>
<copyright><a href="https://creativecommons.org/licenses/by-nc/4.0/" target="_blank" rel="noopener">CC BY-NC 4.0</a></copyright>
<atom:link href="https://amuraru.github.io/index.xml" rel="self" type="application/rss+xml" />
<item>
<title>About</title>
<link>https://amuraru.github.io/about/</link>
<pubDate>Sat, 25 May 2019 00:15:06 +0300</pubDate>
<guid>https://amuraru.github.io/about/</guid>
<description>Adrian Muraru @adobe
Open Source enthusiast</description>
</item>
<item>
<title>Spark RDD Shuffle internals</title>
<link>https://amuraru.github.io/posts/2019/05/spark-rdd-shuffle-internals/</link>
<pubDate>Fri, 24 May 2019 23:45:10 +0300</pubDate>
<guid>https://amuraru.github.io/posts/2019/05/spark-rdd-shuffle-internals/</guid>
<description>group-by uses an aggregator to combineValuesByKey. At the DAG level, the configuration is as follows (see PairRDDFunctions.scala#L503-L505):
val createCombiner = (v: V) =&gt; CompactBuffer(v)
val mergeValue = (buf: CompactBuffer[V], v: V) =&gt; buf += v
val mergeCombiners = (c1: CompactBuffer[V], c2: CompactBuffer[V]) =&gt; c1 ++= c2
which eventually calls PairRDDFunctions.combineByKeyWithClassTag:
val aggregator = new Aggregator[K, V, C](
  self.context.clean(createCombiner),
  self.context.clean(mergeValue),
  self.context.clean(mergeCombiners))
which, in turn, boils down to:
new ShuffledRDD[K, V, C](self, partitioner)
  .setSerializer(serializer)
  .</description>
</item>
</channel>
</rss>