[issue 63] Flink alert sample #64

Closed
wants to merge 51 commits into from
51 commits
502e2e6
example to generate alert from apache access log via logstash pravega…
hldnova Mar 23, 2018
75c4815
correct typo
hldnova May 17, 2018
27e0792
apply filter earlier to let in just 500 responses
hldnova May 17, 2018
7a01f4b
use jackson to covert object to/from json
hldnova May 18, 2018
44842c0
merge with develop branch
hldnova May 18, 2018
e59f71d
example to generate alert from apache access log via logstash pravega…
hldnova Mar 23, 2018
b3728a4
correct typo
hldnova May 17, 2018
90a94e4
apply filter earlier to let in just 500 responses
hldnova May 17, 2018
d2de986
merge with develop branch
hldnova May 18, 2018
95343a6
use jackson to covert object to/from json
hldnova May 18, 2018
213c716
merge with develop branch
hldnova May 18, 2018
f18c900
use builder api
hldnova May 18, 2018
6a52977
update REAMDME
hldnova May 18, 2018
b85510c
update tostring method
hldnova May 18, 2018
b9c4acf
change class member to follow java bean naming convention
hldnova May 22, 2018
78165e3
support secure connection to pravega
hldnova May 24, 2018
2508e10
Add steps to check state of pravega and logstash
hldnova Jun 5, 2018
e6e621d
Merge branch 'develop' into flinkalert-63
hldnova Jun 5, 2018
20ca9cf
fix bug in build script
hldnova Jun 5, 2018
6f611b7
[issue-95] Remove transaction grace period API call (#96)
vijikarthi Jun 12, 2018
d7e7e94
[issue 75] hadoop-connectors examples (#77) (#101)
RaulGracia Jun 13, 2018
715af35
Issue 81: Review and organize docs (#86) (#102)
RaulGracia Jun 13, 2018
82b1157
Issue 73: Streamcuts samples (#80) (#103)
RaulGracia Jun 13, 2018
acb70f6
Issue 83: Reorganize repository structure (#105)
RaulGracia Jun 15, 2018
43ebd5d
Update README.md
EronWright Jun 15, 2018
b9cd292
[issue-104] Fixed incorrect hadoop connector artifact reference (#106)
vijikarthi Jun 18, 2018
985f3ff
Issue 85: turbineSensor throws occasional NPE (#111)
RaulGracia Jun 18, 2018
73a14cd
Issue 108: Relative documentation links. (#109)
RaulGracia Jun 18, 2018
e3fb288
Issue 116: Deleted upload instructions in Flink connector samples, as…
RaulGracia Jun 19, 2018
9987ad9
Issue 84: Channel not shutdown correctly in TurbineHeatSensor sample …
RaulGracia Jun 20, 2018
ff6e127
[issue-121] Fix transaction timeout API change (#122)
vijikarthi Jun 20, 2018
eb29d17
example to generate alert from apache access log via logstash pravega…
hldnova Mar 23, 2018
44d2ab0
correct typo
hldnova May 17, 2018
6bf0c08
apply filter earlier to let in just 500 responses
hldnova May 17, 2018
b3c8451
use jackson to covert object to/from json
hldnova May 18, 2018
0be606e
merge with develop branch
hldnova May 18, 2018
a651957
example to generate alert from apache access log via logstash pravega…
hldnova Mar 23, 2018
ec03f2c
correct typo
hldnova May 17, 2018
f4c4a61
apply filter earlier to let in just 500 responses
hldnova May 17, 2018
7e8020b
merge with develop branch
hldnova May 18, 2018
d3e5b4b
use jackson to covert object to/from json
hldnova May 18, 2018
d0fd8c5
use builder api
hldnova May 18, 2018
16ffc4e
update REAMDME
hldnova May 18, 2018
c70ba9b
update tostring method
hldnova May 18, 2018
54a42ca
change class member to follow java bean naming convention
hldnova May 22, 2018
eb0f8e5
support secure connection to pravega
hldnova May 24, 2018
7f037a6
Add steps to check state of pravega and logstash
hldnova Jun 5, 2018
266298b
fix bug in build script
hldnova Jun 5, 2018
5a32098
re-organize the sample to new org structure
hldnova Jun 22, 2018
2b70db8
re-organize the sample to new org structure
hldnova Jun 22, 2018
b7bec54
Merge branch 'flinkalert-63' of https://github.com/pravega/pravega-sa…
hldnova Jun 22, 2018
238 changes: 170 additions & 68 deletions README.md
@@ -1,104 +1,206 @@
# pravega-samples
# Pravega and Analytics Connectors Examples

Sample applications for Pravega.
This repository contains code samples to demonstrate how developers can work with
[Pravega](http://pravega.io). We also provide code samples to connect analytics
engines such as [Flink](https://flink.apache.org/) and
[Hadoop](http://hadoop.apache.org/) with Pravega as a storage substrate for data
streams.

## Getting Started
For more information on Pravega, we recommend reading the [documentation and the
developer guide](http://pravega.io).

### Building Pravega
# Repository Structure

Optional: This step is required only if you want to use a different version
of Pravega than is published to maven central.
This repository is divided into sub-projects (`pravega-client-examples`, `flink-connector-examples`
and `hadoop-connector-examples`), each demonstrating a specific component. In these sub-projects,
we provide a battery of simple code examples that illustrate how a particular
feature or API works. We also include a `scenarios` folder that contains
more complex applications as sub-projects, which show use cases involving one or more components.

> Hint: Have a look at the [terminology and concepts](http://pravega.io/docs/latest/terminology/) in Pravega.

## Pravega Client Examples
| Example Name | Description | Language |
| ------------- |:-----| :-----|
| `gettingstarted` | Simple example of how to read/write from/to a Pravega `Stream`. | [Java](pravega-client-examples/src/main/java/io/pravega/example/gettingstarted)
| `consolerw` | Application that allows users to work with `Stream`, `Transaction` and `StreamCut` APIs via CLI. | [Java](pravega-client-examples/src/main/java/io/pravega/example/consolerw)
| `noop` | Example of how to add a simple callback executed upon a read event. | [Java](pravega-client-examples/src/main/java/io/pravega/example/noop)
| `statesynchronizer` | Application that allows users to work with `StateSynchronizer` API via CLI. | [Java](pravega-client-examples/src/main/java/io/pravega/example/statesynchronizer)
| `streamcuts` | Application examples demonstrating the use of `StreamCut`s via CLI. | [Java](pravega-client-examples/src/main/java/io/pravega/example/streamcuts)

The related documentation and instructions are [here](pravega-client-examples).

## Flink Connector Examples
| Example Name | Description | Language |
| ------------- |:-----| :-----|
| `wordcount` | Counting the words continuously from a Pravega `Stream` to demonstrate the usage of Flink connector for Pravega. | [Java](flink-connector-examples/src/main/java/io/pravega/example/flink/wordcount)
| `primer` | Demonstrates the Pravega "exactly-once" feature together with Flink checkpointing and exactly-once mode. | [Java](flink-connector-examples/src/main/java/io/pravega/example/flink/primer)

The related documentation and instructions are [here](flink-connector-examples).

## Hadoop Connector Examples
| Example Name | Description | Language |
| ------------- |:-----| :-----|
| `wordcount` | Counts the words from a Pravega `Stream` filled with random text to demonstrate the usage of Hadoop connector for Pravega. | [Java](hadoop-connector-examples/src/main/java/io/pravega/example/hadoop/wordcount)

The related documentation and instructions are [here](hadoop-connector-examples).

## Scenarios
| Example Name | Description | Language |
| ------------- |:-----| :-----|
| [`turbineheatsensor`](scenarios/turbine-heat-sensor) | It emulates parallel sensors producing temperature values (writers) and parallel consumers performing real-time statistics (readers) via Pravega client. | [Java](scenarios/turbine-heat-sensor/src/main/java/io/pravega/turbineheatsensor)
| [`turbineheatprocessor`](scenarios/turbine-heat-processor) | A Flink streaming application for processing temperature data from a Pravega stream produced by the `turbineheatsensor` app. The application computes a daily summary of the temperature range observed on that day by each sensor. | [Java](scenarios/turbine-heat-processor/src/main/java/io/pravega/turbineheatprocessor), [Scala](scenarios/turbine-heat-processor/src/main/scala/io/pravega/turbineheatprocessor)
| [`anomaly-detection`](scenarios/anomaly-detection) | A Flink streaming application for detecting anomalous input patterns using a finite-state machine. | [Java](scenarios/anomaly-detection/src/main/java/io/pravega/anomalydetection)


# Build Instructions

Next, we provide instructions for building the `pravega-samples` repository. There are two main options:
- _Out-of-the-box_: If you want a quick start, run the samples by building `pravega-samples` out-of-the-box
(go straight to section `Pravega Samples Build Instructions`).
- _Build from source_: If you want to have fun building the different projects from source, please read
section `Building Pravega Components from Source (Optional)` before building `pravega-samples`.

## Pre-requisites

* Java 8

## Building Pravega Components from Source (Optional)

### Pravega Build Instructions

If you want to build Pravega from source, you may need to generate the latest Pravega `jar` files and install them to
your local Maven repository. To build Pravega from sources and use it here, please run the following commands:

Install the Pravega client libraries to your local Maven repository:
```
$ git clone https://github.com/pravega/pravega.git
$./gradlew install
$ cd pravega
$ ./gradlew install
```

### Building the Flink Connector
The above command should generate the required `jar` files into your local Maven repository.

Optional: This step is required only if you want to use a different version
of Pravega than is published to maven central.
> Hint: To use the Pravega version you just built in the sample applications, update the
`pravegaVersion=<local_maven_pravega_version>` property in the `gradle.properties` file
of `pravega-samples`.
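
The hint above amounts to changing one line of `gradle.properties`. As a minimal sketch, assuming a POSIX shell, the edit could be scripted as follows. `set_property` is a hypothetical helper written for this illustration and is not part of `pravega-samples`; editing the file by hand works just as well.

```shell
# Hypothetical helper (not part of pravega-samples): set KEY=VALUE in a
# properties file, replacing an existing entry or appending a new one.
set_property() {
    file="$1"
    key="$2"
    value="$3"
    if grep -q "^${key}=" "$file"; then
        # Rewrite the existing line. A temporary file is used instead of
        # `sed -i`, whose syntax differs between GNU and BSD sed.
        sed "s|^${key}=.*|${key}=${value}|" "$file" > "$file.tmp" && mv "$file.tmp" "$file"
    else
        # The key is not present yet, so append it.
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}
```

From the root of `pravega-samples`, it could be called as `set_property gradle.properties pravegaVersion 0.3.0-1889.2990193-SNAPSHOT`, where the version string is whatever your local build installed. The sketch assumes the value contains no `|` character.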

For more information, please visit [Pravega](https://github.com/pravega/pravega).

### Flink Connector Build Instructions

To build the Flink connector from source, follow the steps below to build and publish the artifacts
to your local Maven repository:

Install the shaded Flink Connector library to your local Maven repository:
```
$ git clone https://github.com/pravega/flink-connectors.git
$./gradlew install
$ git clone --recursive https://github.com/pravega/flink-connectors.git
$ cd flink-connectors
$ ./gradlew install
```

### Building the Samples
Use the built-in gradle wrapper to build the samples.
> Hint: To use the Flink connector version you just built in the sample applications, update the
`flinkConnectorVersion=<local_maven_flink_connector_version>` property in the `gradle.properties` file
of `pravega-samples`.


For more information, please visit [Flink Connectors](https://github.com/pravega/flink-connectors).

### Hadoop Connector Build Instructions

To build the Hadoop connector from source, follow the steps below to build and publish the artifacts
to your local Maven repository:

```
$ ./gradlew build
...
BUILD SUCCESSFUL
$ git clone --recurse-submodules https://github.com/pravega/hadoop-connectors.git
$ cd hadoop-connectors
$ ./gradlew install
```

### Distributing (Flink Samples)
#### Assemble
Use gradle to assemble a distribution folder containing the Flink programs as a ready-to-deploy uber-jar called `pravega-flink-examples-0.1.0-SNAPSHOT-all.jar`.
```
$ ./gradlew installDist
...
$ ls -R flink-examples/build/install/pravega-flink-examples
bin lib
> Hint: To use the Hadoop connector version you just built in the sample applications, update the
`hadoopConnectorVersion=<local_maven_hadoop_connector_version>` property in the `gradle.properties` file
of `pravega-samples`.


For more information, please visit [Hadoop Connectors](https://github.com/pravega/hadoop-connectors).

### Configuring Pravega Samples for Running with Source Builds

In the previous instructions, we noted that you will need to change the `gradle.properties` file in
`pravega-samples` for using the Pravega components built from source. Here we provide an example of how to do so:

1) Imagine that we want to build Pravega from source. Let us assume that we
executed `git clone https://github.com/pravega/pravega.git` and the last commit of
`master` branch is `2990193xxx`.

2) After executing `./gradlew install`, we will see in our local Maven repository
(e.g., `~/.m2/repository/io/pravega/*`) artifacts whose names contain that commit version,
such as `0.3.0-1889.2990193-SNAPSHOT`. These artifacts are the result of building Pravega from source.

3) The only thing you have to do is to set `pravegaVersion=0.3.0-1889.2990193-SNAPSHOT` in the `gradle.properties`
file of `pravega-samples`.

flink-examples/build/install/pravega-flink-examples/bin:
run-example
While this example is for Pravega, the same procedure applies to the Flink and Hadoop connectors.


## Pravega Samples Build Instructions

The `pravega-samples` project is prepared for working out-of-the-box with
[release artifacts](https://github.com/pravega/pravega/releases) of Pravega components, which are already
available in Maven central. To build `pravega-samples` from source, use the built-in gradle wrapper as follows:

flink-examples/build/install/pravega-flink-examples/lib:
pravega-flink-examples-0.1.0-SNAPSHOT-all.jar
```
$ git clone https://github.com/pravega/pravega-samples.git
$ cd pravega-samples
$ ./gradlew clean installDist
```
That's it! You are good to go and execute the examples :)

#### Upload
The `upload` task makes it easy to upload the sample binaries to your cluster. First, configure Gradle
with the address of a node in your cluster. Edit `~/.gradle/gradle.properties` to specify a value for `dcosAddress`.
To ease their execution, most examples can be run either using the gradle wrapper (gradlew) or scripts.
The above gradle command automatically creates the execution scripts that can be found under:

```
$ cat ~/.gradle/gradle.properties
dcosAddress=10.240.124.164
pravega-samples/pravega-client-examples/build/install/pravega-client-examples/bin
```

Then, upload the samples to the cluster. They'll be copied to `/home/centos` on the target node.
There is a Linux/Mac script and a Windows (.bat) script for each separate executable.

_Working with `develop` branch_: If you are curious about the most recent sample applications,
you may like to try the `develop` version of `pravega-samples` as well. To do so, just clone the
`develop` branch instead of `master` (default):

```
$ ./gradlew upload
$ git clone -b develop https://github.com/pravega/pravega-samples.git
$ cd pravega-samples
$ ./gradlew clean installDist
```

## Flink Samples
The `develop` branch works with Pravega snapshot artifacts published in
our [JFrog repository](https://oss.jfrog.org/artifactory/jfrog-dependencies/io/pravega/) instead of
using release versions.

### Anomaly Detection
A Flink streaming application for detecting anomalous input patterns using a finite-state machine.

_See the [anomaly-detection/](https://github.com/pravega/pravega-samples/tree/master/anomaly-detection) directory for more information._
# Proposed Roadmap

### Turbine Heat Processor
A Flink streaming application for processing temperature data from a Pravega stream. Complements the Turbine Heat Sensor app (external). The application computes a daily summary of the temperature range observed on that day by each sensor.
We propose a roadmap to proceed with the execution of examples based on their complexity:
1. [Pravega client examples](pravega-client-examples):
First step to understand the basics of Pravega and exercise the concepts presented in the documentation.
2. [Flink connector examples](flink-connector-examples):
These examples show the basic functionality of the Flink connector for Pravega.
3. [Hadoop connector examples](hadoop-connector-examples):
These examples show the basic functionality of the Hadoop connector for Pravega.
4. [Scenarios](scenarios): Applications that go beyond the basic usage of Pravega APIs, which may include complex interactions
between Pravega and analytics engines (e.g., Flink, Hadoop, Spark) to demonstrate analytics use cases.

Automatically creates a scope (default: `examples`) and stream (default: `turbineHeatTest`) as necessary.
# Where to Find Help

#### Running
Run the sample from the command-line:
```
$ bin/run-example [--controller <URI>] [--input <scope>/<stream>] [--startTime <long>] [--output <path>]
```
Documentation on Pravega and Analytics Connectors:
* [Pravega.io](http://pravega.io/), [Pravega Wiki](https://github.com/pravega/pravega/wiki).
* [Flink Connectors Wiki](https://github.com/pravega/flink-connectors/wiki).

Alternately, run the sample from the Flink UI.
- JAR: `pravega-flink-examples-0.1.0-SNAPSHOT-all.jar`
- Main class: `io.pravega.examples.flink.iot.TurbineHeatProcessor` or `io.pravega.examples.flink.iot.TurbineHeatProcessorScala`
Did you find a problem or bug?
* First, check our [FAQ](http://pravega.io/docs/latest/faq/).
* If the FAQ does not help you, create a [new GitHub issue](https://github.com/pravega/pravega-samples/issues).

#### Outputs
The application outputs the daily summary as a comma-separated values (CSV) file, one line per sensor per day. The data is
also emitted to stdout (which may be viewed in the Flink UI). For example:
Do you want to contribute a new example application?
* Follow the [guidelines for contributors](https://github.com/pravega/pravega/wiki/Contributing).

```
...
SensorAggregate(1065600000,12,Illinois,(60.0,100.0))
SensorAggregate(1065600000,3,Arkansas,(60.0,100.0))
SensorAggregate(1065600000,7,Delaware,(60.0,100.0))
SensorAggregate(1065600000,15,Kansas,(40.0,80.0))
SensorAggregate(1152000000,3,Arkansas,(60.0,100.0))
SensorAggregate(1152000000,12,Illinois,(60.0,100.0))
SensorAggregate(1152000000,15,Kansas,(40.0,80.0))
SensorAggregate(1152000000,7,Delaware,(60.0,100.0))
...
```
Have fun!!
17 changes: 0 additions & 17 deletions build.gradle
@@ -43,21 +43,4 @@ subprojects {
group "io.pravega"
version samplesVersion
}

remotes {
dcos {
host = dcosAddress
user = 'centos'
agent = true
}
}
ssh.settings {
knownHosts = allowAnyHosts
}

afterEvaluate {
task upload(type: Exec, dependsOn: installDist) {
commandLine 'rsync', '-az', project.installDist.destinationDir, "${remotes.dcos.user}@${remotes.dcos.host}:~"
}
}
}
49 changes: 49 additions & 0 deletions flink-connector-examples/README.md
@@ -0,0 +1,49 @@
# Flink Connector Examples for Pravega
Battery of code examples to demonstrate the capabilities of Pravega as a data stream storage
system for Apache Flink.

## Pre-requisites
1. Pravega running (see [here](http://pravega.io/docs/latest/getting-started/) for instructions)
2. Build [pravega-samples](https://github.com/pravega/pravega-samples) repository
3. Apache Flink running


## Distributing Flink Samples
Use gradle to assemble a distribution folder containing the Flink programs as a ready-to-deploy
uber-jar called `pravega-flink-examples-<VERSION>-all.jar`:

```
$ ./gradlew installDist
...
$ ls -R flink-connector-examples/build/install/pravega-flink-examples
bin lib

flink-connector-examples/build/install/pravega-flink-examples/bin:
run-example

flink-connector-examples/build/install/pravega-flink-examples/lib:
pravega-flink-examples-VERSION-all.jar
```
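
Because the jar name embeds the build version, scripts that deploy it usually need to locate the file first. The following is a minimal sketch, assuming a POSIX shell and the directory layout listed above; `find_uber_jar` is a hypothetical helper and not part of this repository.

```shell
# Hypothetical helper (not part of pravega-samples): print the path of the
# uber-jar found under an install directory, or fail with a message.
find_uber_jar() {
    install_dir="$1"
    # The glob stands in for the <VERSION> part of the file name.
    for jar in "$install_dir"/lib/pravega-flink-examples-*-all.jar; do
        if [ -f "$jar" ]; then
            printf '%s\n' "$jar"
            return 0
        fi
    done
    echo "no uber-jar under $install_dir/lib; run ./gradlew installDist first" >&2
    return 1
}
```

Assuming the `flink` CLI is on your `PATH`, the result could then be submitted with something like `flink run -c <main-class> "$(find_uber_jar flink-connector-examples/build/install/pravega-flink-examples)"`, where `<main-class>` is the entry point documented for the example you want to run.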

---

# Examples Catalog

## Word Count

This example demonstrates how to use the Pravega Flink Connectors to write data collected
from an external network stream into a Pravega `Stream` and read the data from the Pravega `Stream`.
See [wordcount](doc/flink-wordcount/README.md) for more information and execution instructions.


## Exactly Once Sample

This sample demonstrates the Pravega EXACTLY_ONCE feature in conjunction with Flink checkpointing and exactly-once mode.
See [Exactly Once Sample](doc/exactly-once/README.md) for instructions.

## High Error Count Alert

This example demonstrates how to use the Pravega Flink connectors to read and
parse Apache access logs from logstash via the [logstash pravega output plugin](https://github.com/pravega/logstash-output-pravega),
and how to generate an alert when the error count is high within a time frame.
See [High Count Alerter](doc/high-count-alerter/README.md) for instructions.
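
The alerting idea (count error responses per time frame and alert when the count is high) can be illustrated outside Flink. The following is a plain-shell sketch of that logic only, not the sample's Flink job. `high_error_minutes` is a hypothetical helper; the one-minute window, the focus on HTTP 500 responses, and the threshold argument are assumptions chosen for illustration. It expects Apache access log lines in the common or combined format on standard input.

```shell
# Hypothetical helper (illustration only): print one ALERT line for each
# minute in which the number of HTTP 500 responses exceeds the threshold.
high_error_minutes() {
    threshold="$1"
    awk -v limit="$threshold" '
        # Field 9 is the status code; field 4 is "[dd/Mon/yyyy:HH:MM:SS".
        # Characters 2..18 of field 4 identify the minute.
        $9 == 500 { count[substr($4, 2, 17)]++ }
        END {
            for (minute in count)
                if (count[minute] > limit)
                    print "ALERT " minute " " count[minute]
        }'
}
```

For example, `high_error_minutes 10 < access.log` would report every minute with more than ten 500 responses. The field positions assume request paths without spaces; the Flink sample instead works on the events parsed by logstash.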