Revamped README for DAI MOJO Deployment to CEM (EFM, NiFi Reg, MiNiFi CPP)

Removed most of the commands from the README and created a setup.sh script that runs most of the commands from the previous README. Added new commands that install the CEM components (Edge Flow Manager, NiFi Registry, etc.) so that users can build MiNiFi C++ data flows in the drag-and-drop UI provided by EFM. Users can still use the custom h2o.ai MiNiFi Python Processors to deploy the Driverless AI MOJO Scoring Pipeline in a MiNiFi C++ Data Flow. Added many images to make it easier to build MiNiFi C++ Data Flows in EFM. Added a troubleshooting section on what to do when the custom h2o.ai Python Processors do not appear, and another on checking whether EFM, NiFi Registry, or the MiNiFi C++ Agent is running. Incorporated feedback from Nick Png and Edge Orendain.
Showing 33 changed files with 1,440 additions and 977 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,150 @@ | ||
# Web Server Properties | ||
# address: the hostname or ip address of the interface to bind to; to bind to all, use 0.0.0.0 | ||
efm.server.address=EFM_SERVER_IP | ||
efm.server.port=10080 | ||
efm.server.servlet.contextPath=/efm | ||
|
||
# Cluster Properties | ||
# address: the address (host:port) to bind to for the embedded Hazelcast instance that coordinates cluster state | ||
# memberAddress: the address (host:port) to advertise to other cluster members, if different from the bindAddress | ||
# members: comma-separated list of all cluster nodes; must be identical on all nodes in the cluster, including order | ||
# format of node address is hostname or IP or hostname:port or IP:port | ||
# port is optional (5701 the default port) | ||
efm.cluster.enabled=false | ||
|
||
# Cluster TLS/SSL Tunnel Properties | ||
# enabled: enable secure communication within the cluster via a stunnel proxy | ||
# command: the command or path to executable for stunnel, which must be installed, e.g., /usr/bin/stunnel | ||
# logLevel: the level of stunnel debug output: emerg|alert|crit|err|warning|notice|info|debug | ||
# logFile: (optional) if specified, the file to use for stunnel logs. if not specified, output is to EFM App Log | ||
# caFile: The file containing Certificate Authority certificates. Must be PEM format. | ||
# cert: The file containing this cluster node's public certificate. Must be PEM format. | ||
# key: The file containing this cluster node's private key. Must be PEM format. Can be encrypted or unencrypted | ||
# keyPassword: (optional) If the key file is encrypted with a password, the password to decrypt the key file. | ||
# proxyServerPort: the port that will receive the TLS traffic and redirect to Hazelcast (default 10090) | ||
# proxyClientPortStart: starting with the given port, the ports used to proxy communication with other cluster members | ||
# over the secure TLS tunnel (default 10091). The number of ports used is one fewer than the number of cluster members. | ||
# For additional Stunnel configuration options, see https://www.stunnel.org/static/stunnel.html | ||
# global options, service level options, or client-/server-specific server options can be specified as | ||
# key-value pairs with the appropriate prefix efm.cluster.stunnel.[global|service|clientService|serverService].* | ||
efm.cluster.stunnel.enabled=false | ||
efm.cluster.stunnel.command=stunnel | ||
efm.cluster.stunnel.logLevel=warning | ||
efm.cluster.stunnel.caFile= | ||
efm.cluster.stunnel.cert= | ||
efm.cluster.stunnel.key= | ||
efm.cluster.stunnel.keyPassword= | ||
efm.cluster.stunnel.proxyServerPort=10090 | ||
efm.cluster.stunnel.proxyClientPortStart=10091 | ||
|
||
# Web Server TLS Properties | ||
efm.server.ssl.enabled=false | ||
efm.server.ssl.keyStore=./conf/keystore.jks | ||
efm.server.ssl.keyStoreType=jks | ||
efm.server.ssl.keyStorePassword= | ||
efm.server.ssl.keyPassword= | ||
efm.server.ssl.trustStore=./conf/truststore.jks | ||
efm.server.ssl.trustStoreType=jks | ||
efm.server.ssl.trustStorePassword= | ||
efm.server.ssl.clientAuth=WANT | ||
|
||
# User Authentication Properties | ||
# authentication via TLS mutual auth with client certificates | ||
efm.security.user.certificate.enabled=false | ||
# authentication via Knox SSO token passed in a cookie header | ||
efm.security.user.knox.enabled=false | ||
efm.security.user.knox.url= | ||
efm.security.user.knox.publicKey= | ||
efm.security.user.knox.cookieName= | ||
efm.security.user.knox.audiences= | ||
# authentication via generic reverse proxy with user passed in a header | ||
efm.security.user.proxy.enabled=false | ||
efm.security.user.proxy.headerName=x-webauth-user | ||
|
||
# NiFi Registry Properties | ||
# url: the base URL of a NiFi Registry instance | ||
# bucket: Only set one of bucketId OR bucketName | ||
# flowRefreshInterval: specify value and units (d=days, h=hours, m=minutes, s=seconds, ms=milliseconds) | ||
efm.nifi.registry.enabled=true | ||
efm.nifi.registry.url=http://EFM_SERVER_IP:18080 | ||
efm.nifi.registry.bucketId= | ||
efm.nifi.registry.bucketName=DaiMojo | ||
efm.nifi.registry.flowRefreshInterval=60s | ||
|
||
# Database Properties | ||
efm.db.url=jdbc:postgresql://EFM_SERVER_IP:5432/efm | ||
efm.db.driverClass=org.postgresql.Driver | ||
efm.db.username=efm | ||
efm.db.password=clouderah2oai | ||
efm.db.maxConnections=50 | ||
efm.db.sqlDebug=false | ||
|
||
# Heartbeat Retention Properties | ||
# For maxAgeToKeep, specify value and units (d=days, h=hours, m=minutes, s=seconds, ms=milliseconds) | ||
# Set to 0 to disable persisting events entirely | ||
efm.heartbeat.maxAgeToKeep=0 | ||
efm.heartbeat.persistContent=false | ||
|
||
# Event Retention Properties | ||
# Specify value and units (d=days, h=hours, m=minutes, s=seconds, ms=milliseconds) | ||
# Set to 0 to disable persisting events entirely | ||
# Set no value to disable auto-cleanup (manual deletion only) | ||
efm.event.cleanupInterval=30s | ||
efm.event.maxAgeToKeep.debug=0m | ||
efm.event.maxAgeToKeep.info=1h | ||
efm.event.maxAgeToKeep.warn=1d | ||
efm.event.maxAgeToKeep.error=7d | ||
|
||
# Agent Class Flow Monitor Properties | ||
# Specify value and units (d=days, h=hours, m=minutes, s=seconds, ms=milliseconds) | ||
efm.agent-class-monitor.interval=15s | ||
|
||
# Agent Monitoring Properties | ||
# Specify value and units (d=days, h=hours, m=minutes, s=seconds, ms=milliseconds) | ||
# Set to zero to disable threshold monitoring entirely | ||
efm.monitor.maxHeartbeatInterval=5m | ||
|
||
# Operation Properties | ||
efm.operation.monitoring.enabled=true | ||
efm.operation.monitoring.inDeployedStateTimeout=5m | ||
efm.operation.monitoring.inDeployedStateCheckFrequency=1m | ||
efm.operation.monitoring.rollingBatchOperationsSize=10 | ||
efm.operation.monitoring.rollingBatchOperationsFrequency=5s | ||
|
||
# Metrics Properties | ||
management.metrics.export.simple.enabled=false | ||
management.metrics.export.prometheus.enabled=true | ||
management.metrics.enable.efm.heartbeat=true | ||
management.metrics.enable.efm.agentStatus=true | ||
management.metrics.enable.efm.flowStatus=true | ||
management.metrics.enable.efm.repo=true | ||
management.metrics.efm.enable-tag.efmHost=true | ||
management.metrics.efm.enable-tag.agentClass=true | ||
management.metrics.efm.enable-tag.agentManifestId=true | ||
management.metrics.efm.enable-tag.agentId=true | ||
management.metrics.efm.enable-tag.deviceId=false | ||
management.metrics.efm.enable-tag.flowId=true | ||
management.metrics.efm.enable-tag.connectionId=true | ||
management.metrics.efm.max-tags.agentClass=100 | ||
management.metrics.efm.max-tags.agentManifestId=10 | ||
management.metrics.efm.max-tags.agentId=100 | ||
management.metrics.efm.max-tags.deviceId=100 | ||
management.metrics.efm.max-tags.flowId=100 | ||
management.metrics.efm.max-tags.connectionId=1000 | ||
|
||
# EL Specification Properties | ||
efm.el.specifications.dir=./specs | ||
|
||
# Logging Properties | ||
# logging.level.{logger-name}={DEBUG|INFO|WARN|ERROR} | ||
logging.level.com.cloudera.cem.efm=INFO | ||
logging.level.com.hazelcast=WARN | ||
logging.level.com.hazelcast.internal.cluster.ClusterService=INFO | ||
logging.level.com.hazelcast.internal.nio.tcp.TcpIpConnection=ERROR | ||
logging.level.com.hazelcast.internal.nio.tcp.TcpIpConnector=ERROR | ||
|
||
# Encryption Password used for encrypting sensitive data saved to the EFM server | ||
efm.encryption.password=clouderah2oai | ||
|
||
# This property was not present, so we added it; it can be placed anywhere in this file. Default is 'First In' | ||
efm.manifest.strategy=Last In |
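
The properties above use `EFM_SERVER_IP` as a placeholder in the server address, the NiFi Registry URL, and the database URL. The following is a minimal sketch of how such a template could be filled in and checked before starting EFM. It assumes the placeholder is meant to be replaced by plain text substitution; the function names, the sample address `192.0.2.10`, and the trimmed `SAMPLE` text are hypothetical and are not taken from the commit's setup.sh.

```python
from typing import Dict, List


def render_template(text: str, server_ip: str,
                    placeholder: str = "EFM_SERVER_IP") -> str:
    """Replace every occurrence of the placeholder with the real address."""
    return text.replace(placeholder, server_ip)


def parse_properties(text: str) -> Dict[str, str]:
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def unresolved_keys(props: Dict[str, str],
                    placeholder: str = "EFM_SERVER_IP") -> List[str]:
    """List the keys whose values still contain the placeholder."""
    return [key for key, value in props.items() if placeholder in value]


# A trimmed copy of the properties shown above (illustrative only).
SAMPLE = """\
# Web Server Properties
efm.server.address=EFM_SERVER_IP
efm.server.port=10080
efm.nifi.registry.url=http://EFM_SERVER_IP:18080
efm.db.url=jdbc:postgresql://EFM_SERVER_IP:5432/efm
"""

if __name__ == "__main__":
    rendered = parse_properties(render_template(SAMPLE, "192.0.2.10"))
    for key, value in rendered.items():
        print(f"{key}={value}")
    print("unresolved:", unresolved_keys(rendered))
```

Running `unresolved_keys` on the untouched template is a quick way to see which settings still need a real address.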
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,34 @@ | ||
# Core Properties # | ||
nifi.version=0.7.0 | ||
nifi.flow.configuration.file=./conf/config.yml | ||
nifi.administrative.yield.duration=30 sec | ||
# If a component has no work to do (is "bored"), how long should we wait before checking again for work? | ||
nifi.bored.yield.duration=10 millis | ||
|
||
# Provenance Repository # | ||
nifi.provenance.repository.directory.default=${MINIFI_HOME}/provenance_repository | ||
nifi.provenance.repository.max.storage.time=1 MIN | ||
nifi.provenance.repository.max.storage.size=1 MB | ||
nifi.flowfile.repository.directory.default=${MINIFI_HOME}/flowfile_repository | ||
nifi.database.content.repository.directory.default=${MINIFI_HOME}/content_repository | ||
|
||
nifi.c2.root.classes=DeviceInfoNode,AgentInformation,FlowInformation | ||
## define metrics reported | ||
nifi.c2.root.class.definitions=metrics | ||
nifi.c2.root.class.definitions.metrics.name=metrics | ||
nifi.c2.root.class.definitions.metrics.metrics=typedmetrics | ||
nifi.c2.root.class.definitions.metrics.metrics.typedmetrics.name=RuntimeMetrics | ||
nifi.c2.root.class.definitions.metrics.metrics.queuemetrics.name=QueueMetrics | ||
nifi.c2.root.class.definitions.metrics.metrics.queuemetrics.classes=QueueMetrics | ||
nifi.c2.root.class.definitions.metrics.metrics.typedmetrics.classes=ProcessMetrics,SystemInformation | ||
nifi.c2.root.class.definitions.metrics.metrics.processorMetrics.name=ProcessorMetric | ||
nifi.c2.root.class.definitions.metrics.metrics.processorMetrics.classes=GetFileMetrics | ||
|
||
#JNI properties | ||
nifi.framework.dir=${MINIFI_HOME}/minifi-jni/lib | ||
nifi.nar.directory=${MINIFI_HOME}/minifi-jni/nars | ||
nifi.nar.deploy.directory=${MINIFI_HOME}/minifi-jni/nardeploy | ||
nifi.nar.docs.directory=${MINIFI_HOME}/minifi-jni/nardocs | ||
# must be comma separated | ||
nifi.jvm.options=-Xmx1G | ||
nifi.python.processor.dir=${MINIFI_HOME}/minifi-python/,${MINIFI_HOME}/minifi-python/h2o/ |
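
The last property lists two comma-separated directories in which the agent looks for Python processors, including the `h2o/` directory. When the custom processors do not appear, one possible first check is whether those directories exist once `${MINIFI_HOME}` is expanded. The sketch below assumes the variable expands by simple substitution; the helper name and the install path `/opt/minifi` are hypothetical.

```python
import os
from string import Template
from typing import List


def python_processor_dirs(value: str, minifi_home: str) -> List[str]:
    """Expand ${MINIFI_HOME} and split the comma-separated directory list."""
    expanded = Template(value).substitute(MINIFI_HOME=minifi_home)
    return [part.strip() for part in expanded.split(",") if part.strip()]


# Value of nifi.python.processor.dir as it appears in the file above.
VALUE = "${MINIFI_HOME}/minifi-python/,${MINIFI_HOME}/minifi-python/h2o/"

if __name__ == "__main__":
    # /opt/minifi is an assumed install location; use your own MINIFI_HOME.
    for directory in python_processor_dirs(VALUE, "/opt/minifi"):
        status = "found" if os.path.isdir(directory) else "MISSING"
        print(f"{status}: {directory}")
```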
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,99 @@ | ||
# PostgreSQL Client Authentication Configuration File | ||
# =================================================== | ||
# | ||
# Refer to the "Client Authentication" section in the PostgreSQL | ||
# documentation for a complete description of this file. A short | ||
# synopsis follows. | ||
# | ||
# This file controls: which hosts are allowed to connect, how clients | ||
# are authenticated, which PostgreSQL user names they can use, which | ||
# databases they can access. Records take one of these forms: | ||
# | ||
# local DATABASE USER METHOD [OPTIONS] | ||
# host DATABASE USER ADDRESS METHOD [OPTIONS] | ||
# hostssl DATABASE USER ADDRESS METHOD [OPTIONS] | ||
# hostnossl DATABASE USER ADDRESS METHOD [OPTIONS] | ||
# | ||
# (The uppercase items must be replaced by actual values.) | ||
# | ||
# The first field is the connection type: "local" is a Unix-domain | ||
# socket, "host" is either a plain or SSL-encrypted TCP/IP socket, | ||
# "hostssl" is an SSL-encrypted TCP/IP socket, and "hostnossl" is a | ||
# plain TCP/IP socket. | ||
# | ||
# DATABASE can be "all", "sameuser", "samerole", "replication", a | ||
# database name, or a comma-separated list thereof. The "all" | ||
# keyword does not match "replication". Access to replication | ||
# must be enabled in a separate record (see example below). | ||
# | ||
# USER can be "all", a user name, a group name prefixed with "+", or a | ||
# comma-separated list thereof. In both the DATABASE and USER fields | ||
# you can also write a file name prefixed with "@" to include names | ||
# from a separate file. | ||
# | ||
# ADDRESS specifies the set of hosts the record matches. It can be a | ||
# host name, or it is made up of an IP address and a CIDR mask that is | ||
# an integer (between 0 and 32 (IPv4) or 128 (IPv6) inclusive) that | ||
# specifies the number of significant bits in the mask. A host name | ||
# that starts with a dot (.) matches a suffix of the actual host name. | ||
# Alternatively, you can write an IP address and netmask in separate | ||
# columns to specify the set of hosts. Instead of a CIDR-address, you | ||
# can write "samehost" to match any of the server's own IP addresses, | ||
# or "samenet" to match any address in any subnet that the server is | ||
# directly connected to. | ||
# | ||
# METHOD can be "trust", "reject", "md5", "password", "gss", "sspi", | ||
# "ident", "peer", "pam", "ldap", "radius" or "cert". Note that | ||
# "password" sends passwords in clear text; "md5" is preferred since | ||
# it sends encrypted passwords. | ||
# | ||
# OPTIONS are a set of options for the authentication in the format | ||
# NAME=VALUE. The available options depend on the different | ||
# authentication methods -- refer to the "Client Authentication" | ||
# section in the documentation for a list of which options are | ||
# available for which authentication methods. | ||
# | ||
# Database and user names containing spaces, commas, quotes and other | ||
# special characters must be quoted. Quoting one of the keywords | ||
# "all", "sameuser", "samerole" or "replication" makes the name lose | ||
# its special character, and just match a database or username with | ||
# that name. | ||
# | ||
# This file is read on server startup and when the postmaster receives | ||
# a SIGHUP signal. If you edit the file on a running system, you have | ||
# to SIGHUP the postmaster for the changes to take effect. You can | ||
# use "pg_ctl reload" to do that. | ||
|
||
# Put your actual configuration here | ||
# ---------------------------------- | ||
# | ||
# If you want to allow non-local connections, you need to add more | ||
# "host" records. In that case you will also need to make PostgreSQL | ||
# listen on a non-local interface via the listen_addresses | ||
# configuration parameter, or via the -i or -h command line switches. | ||
|
||
|
||
|
||
|
||
# DO NOT DISABLE! | ||
# If you change this first entry you will need to make sure that the | ||
# database superuser can access the database using some other method. | ||
# Noninteractive access to all databases is required during automatic | ||
# maintenance (custom daily cronjobs, replication, and similar tasks). | ||
# | ||
# Database administrative login by Unix domain socket | ||
local all postgres peer | ||
|
||
# TYPE DATABASE USER ADDRESS METHOD | ||
|
||
# "local" is for Unix domain socket connections only | ||
local all all trust | ||
# IPv4 connections from any address (not only local): | ||
host all all 0.0.0.0/0 trust | ||
# IPv6 connections from any address (not only local): | ||
host all all ::/0 trust | ||
# Allow replication connections from localhost, by a user with the | ||
# replication privilege. | ||
#local replication postgres peer | ||
#host replication postgres 127.0.0.1/32 md5 | ||
#host replication postgres ::1/128 md5 |
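
The `host` records above combine the `trust` method with `0.0.0.0/0` and `::/0`, so any client that can reach the server may connect without a password. That may be acceptable for a short-lived demo but is worth flagging elsewhere. The sketch below is one way to spot such records. The function name and sample text are hypothetical, and the parser is deliberately simplified: it ignores quoted names and records that give the netmask in a separate column.

```python
import ipaddress
from typing import List, Tuple

HOST_TYPES = ("host", "hostssl", "hostnossl")


def open_trust_rules(text: str) -> List[Tuple[str, str, str, str]]:
    """Return (type, database, user, address) for host records that
    use 'trust' with a zero-length prefix, i.e. match every address."""
    flagged = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        fields = line.split()
        # Host records: TYPE DATABASE USER ADDRESS METHOD [OPTIONS]
        if fields[0] not in HOST_TYPES or len(fields) < 5:
            continue
        conn_type, database, user, address, method = fields[:5]
        try:
            network = ipaddress.ip_network(address, strict=False)
        except ValueError:
            continue  # host names, samehost, samenet: not checked here
        if method == "trust" and network.prefixlen == 0:
            flagged.append((conn_type, database, user, address))
    return flagged


# The active records from the file above (illustrative copy).
SAMPLE = """\
local all postgres peer
local all all trust
host all all 0.0.0.0/0 trust
host all all ::/0 trust
#host replication postgres 127.0.0.1/32 md5
"""

if __name__ == "__main__":
    for rule in open_trust_rules(SAMPLE):
        print("open trust rule:", " ".join(rule))
```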