Conversation
| <?xml version="1.0" encoding="UTF-8"?> | ||
| <project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" | ||
| xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd"> | ||
| xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd"> |
There was a problem hiding this comment.
This line is only a change in spacing/indentation and should be restored to its original form.
There was a problem hiding this comment.
formatting after the change is proper, previously was a bad indentation
There was a problem hiding this comment.
I see, it's not a big deal when there is only one, but if there were a lot of changes like this, they would become "noise" when trying to review the PR :)
There was a problem hiding this comment.
FYI, the PR shows that there is a merge conflict with this file. Can you take a look and resolve it in your branch?
| public void flatMap(String value, Collector<Tuple2<String, Integer>> out) { | ||
| String[] tokens = value.toLowerCase().split("\\W+"); | ||
|
|
||
| for (String token : tokens) { |
There was a problem hiding this comment.
Why not use more of a Java 8 style to do this part.
There was a problem hiding this comment.
changed, to java 8 stream
| import org.apache.flink.api.java.aggregation.Aggregations; | ||
| import org.apache.flink.api.java.tuple.Tuple2; | ||
|
|
||
| import java.util.Arrays; |
| import org.apache.flink.api.java.tuple.Tuple2; | ||
| import org.apache.flink.util.Collector; | ||
|
|
||
| public class LineSplitter implements FlatMapFunction<String, Tuple2<String, Integer>> { |
There was a problem hiding this comment.
add @SuppressWarnings("serial") to class or create the default serial version ID
|
|
||
| return text.flatMap(new LineSplitter()) | ||
| .groupBy(0) | ||
| .aggregate(Aggregations.SUM, 1); |
There was a problem hiding this comment.
you could just use .sum(1) instead of the .aggregate(...)
There was a problem hiding this comment.
yes, i know but aggregate is more flexible because you can pass other types of operations, so i prefer to show that method, ok?
There was a problem hiding this comment.
Yes it's ok, just making an observation, not necessarily requesting a change.
| @Test | ||
| public void givenListOfAmounts_whenUseMapReduce_thenSumAmountsThatAreOnlyAboveThreshold() throws Exception { | ||
| //given | ||
| DataSource<Integer> amounts = env.fromElements(1, 29, 40, 50); |
There was a problem hiding this comment.
Why not use DataSet instead of DataSource?
There was a problem hiding this comment.
I can change it to DataSet, DataSource is a subtype of it so it is better to have the abstract DataSet
| import static org.assertj.core.api.Assertions.assertThat; | ||
|
|
||
|
|
||
| public class WordCountTest { |
There was a problem hiding this comment.
Let's make it WordCountIntegrationTest
No description provided.