Visual Data Flow: 11. Apache NiFi
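Here are three example Apache NiFi flows, each with a qualitative and a technical explanation. NiFi flows are normally assembled in the web UI rather than written as code, so the <processor> snippets are simplified pseudo-XML that lists each processor's class and key properties; they are not NiFi's own flow-definition format. Class and property names can differ between NiFi versions.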
Example 1: Ingesting Data from a CSV File
Qualitative Explanation:
Purpose: Automate the ingestion of data from a CSV file into a database.
Use Case: ETL (Extract, Transform, Load) workflows for data integration.
Outcome: Data from the CSV file is processed and stored in a database.
Technical Explanation:
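Create Processor: Use GetFile to read the CSV file.
    <processor class="org.apache.nifi.processors.standard.GetFile">
      <property name="Input Directory">/path/to/csv</property>
    </processor>
Transform Data: Use ConvertRecord to parse CSV into JSON. Record Reader and Record Writer refer to controller services, here a CSVReader and a JsonRecordSetWriter.
    <processor class="org.apache.nifi.processors.standard.ConvertRecord">
      <property name="Record Reader">CSVReader</property>
      <property name="Record Writer">JsonRecordSetWriter</property>
    </processor>
Load Data: Use PutDatabaseRecord to insert the records into a database. The processor has no JDBC URL property of its own: the URL (jdbc:mysql://localhost:3306/mydb) is set as the Database Connection URL of a DBCPConnectionPool controller service, which the processor references. Because the previous step emits JSON, the Record Reader here is a JsonTreeReader.
    <processor class="org.apache.nifi.processors.standard.PutDatabaseRecord">
      <property name="Record Reader">JsonTreeReader</property>
      <property name="Statement Type">INSERT</property>
      <property name="Database Connection Pooling Service">DBCPConnectionPool</property>
      <property name="Table Name">mytable</property>
    </processor>
Error Handling: Use LogAttribute to log errors. Connect the failure relationships of the processors above to it so that failed FlowFiles are logged rather than dropped silently.
    <processor class="org.apache.nifi.processors.standard.LogAttribute">
      <property name="Log Level">error</property>
    </processor>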
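The transform and error-handling steps above can be pictured outside NiFi as follows. This is a minimal Python sketch of a CSV-to-JSON record conversion with a failure path, not NiFi code: the function name convert_csv_to_json and the "success"/"failure" labels are hypothetical stand-ins for ConvertRecord and its relationships, and all values stay strings because no schema is applied.

```python
import csv
import io
import json


def convert_csv_to_json(csv_text):
    """Parse CSV text into a JSON array string, one object per row.

    Returns ("success", json_text) or ("failure", error_message), loosely
    mirroring how a processor routes a FlowFile to a relationship.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    records = []
    for row in reader:
        # DictReader files extra cells under the key None and fills
        # missing cells with the value None; treat both as malformed.
        if None in row or None in row.values():
            return "failure", f"malformed row at line {reader.line_num}"
        records.append(row)
    return "success", json.dumps(records)


sample = "id,name,age\n1,Alice,30\n2,Bob,25\n"
relationship, payload = convert_csv_to_json(sample)
print(relationship, payload)
```

A row with too few or too many cells returns "failure", which is the case the LogAttribute step is meant to surface.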
Example 2: Real-Time Data Streaming with Kafka
Qualitative Explanation:
Purpose: Stream real-time data from Kafka topics for processing.
Use Case: Real-time analytics and monitoring of streaming data.
Outcome: Data from Kafka is processed and routed to downstream systems.
Technical Explanation:
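Consume Kafka: Use ConsumeKafka to read data from a Kafka topic. In NiFi 1.x the processor class carries a Kafka client version suffix (for example ConsumeKafka_2_6) and requires a consumer Group ID; the value below is a placeholder.
    <processor class="org.apache.nifi.processors.kafka.pubsub.ConsumeKafka_2_6">
      <property name="Kafka Brokers">localhost:9092</property>
      <property name="Topic Name(s)">mytopic</property>
      <property name="Group ID">my-consumer-group</property>
    </processor>
Transform Data: Use JoltTransformJSON to modify JSON data. With the Chain DSL the specification must be a JSON array of operations. This shift operation moves the value of foo to a new key bar.
    <processor class="org.apache.nifi.processors.standard.JoltTransformJSON">
      <property name="Jolt Transformation DSL">Chain</property>
      <property name="Jolt Specification">[{"operation":"shift","spec":{"foo":"bar"}}]</property>
    </processor>
Route Data: Use RouteOnAttribute to route FlowFiles based on their attributes (RouteOnContent is the content-based counterpart). With the "Route to Property name" strategy, each user-added property becomes a relationship named after it, and its value is an Expression Language condition. The from_mytopic property below is an illustrative example that uses the kafka.topic attribute written by ConsumeKafka.
    <processor class="org.apache.nifi.processors.standard.RouteOnAttribute">
      <property name="Routing Strategy">Route to Property name</property>
      <property name="from_mytopic">${kafka.topic:equals('mytopic')}</property>
    </processor>
Store Data: Use PutHDFS to store data in Hadoop. Hadoop Configuration Resources takes a comma-separated list of configuration files, not a directory.
    <processor class="org.apache.nifi.processors.hadoop.PutHDFS">
      <property name="Hadoop Configuration Resources">/path/to/hadoop/conf/core-site.xml,/path/to/hadoop/conf/hdfs-site.xml</property>
      <property name="Directory">/data/output</property>
    </processor>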
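The transform and route steps above can be sketched in plain Python to show their semantics. This is an illustrative approximation, not Jolt or NiFi code: shift handles only flat key-to-key specs, and the rule name from_mytopic and the kafka.topic attribute lookup are assumptions made for the example.

```python
def shift(record, spec):
    """Flat Jolt-style shift: move each input key named in spec to its
    output key. Keys that the spec does not mention are dropped."""
    return {spec[key]: value for key, value in record.items() if key in spec}


def route_on_attribute(attributes, rules):
    """Return the names of all rules whose predicate matches the
    attributes, or ["unmatched"] when none do."""
    matched = [name for name, predicate in rules.items() if predicate(attributes)]
    return matched or ["unmatched"]


# The shift spec from the transform step: the value of "foo" moves to "bar".
print(shift({"foo": "bar"}, {"foo": "bar"}))  # → {'bar': 'bar'}

# Hypothetical rule keyed on the topic a message came from.
rules = {"from_mytopic": lambda attrs: attrs.get("kafka.topic") == "mytopic"}
print(route_on_attribute({"kafka.topic": "mytopic"}, rules))
print(route_on_attribute({"kafka.topic": "other"}, rules))
```

Note that a shift changes where a value lives, not the value itself, so the input {"foo":"bar"} keeps "bar" as its value under the new key.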
Example 3: Data Enrichment with REST API
Qualitative Explanation:
Purpose: Enrich data by calling a REST API for additional information.
Use Case: Enhancing datasets with external data sources.
Outcome: Data is enriched with additional attributes from the API.
Technical Explanation:
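Fetch Data: Use InvokeHTTP to call a REST API. The API response leaves the processor on the Response relationship, and the unchanged input on Original.
    <processor class="org.apache.nifi.processors.standard.InvokeHTTP">
      <property name="HTTP Method">GET</property>
      <property name="Remote URL">http://api.example.com/data</property>
    </processor>
Merge Data: Use MergeContent to combine API responses with the original data. Merge Format has no JSON option; Binary Concatenation joins the FlowFile contents end to end. MergeContent bundles content and does not join the fields of two JSON objects, so a field-level merge needs a later transform step.
    <processor class="org.apache.nifi.processors.standard.MergeContent">
      <property name="Merge Format">Binary Concatenation</property>
    </processor>
Transform Data: Use ReplaceText to update data fields. With the default Search Value, which matches the whole text, the Replacement Value overwrites the content instead of adding a field. The pattern below matches the final closing brace of the content and appends newField before it.
    <processor class="org.apache.nifi.processors.standard.ReplaceText">
      <property name="Search Value">\}\s*$</property>
      <property name="Replacement Value">,"newField":"value"}</property>
    </processor>
Store Data: Use PutFile to save the enriched data.
    <processor class="org.apache.nifi.processors.standard.PutFile">
      <property name="Directory">/path/to/output</property>
    </processor>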
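The intent of the enrichment flow above, a field-level merge of a record with an API response, can be sketched as follows. This is a hedged Python illustration rather than NiFi code: fetch_extra is a hypothetical stub that returns a canned response in place of a real GET request, and the rule that original fields win on a key clash is an assumption of this sketch.

```python
import json


def fetch_extra(record_id):
    """Hypothetical stand-in for the REST call; returns a canned response."""
    return {"newField": "value"}


def enrich(record, fetch=fetch_extra):
    """Return a copy of record with the API response fields added.

    Fields already present in the original record are kept unchanged.
    """
    enriched = dict(record)
    for key, value in fetch(record["id"]).items():
        enriched.setdefault(key, value)
    return enriched


original = {"id": 1, "name": "Alice"}
result = enrich(original)
print(json.dumps(result, separators=(",", ":")))
```

Passing the fetch function as a parameter keeps the merge logic testable without network access.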
Input and Output Examples
Example 1: Ingesting Data from a CSV File
Input (input.csv):
    id,name,age
    1,Alice,30
    2,Bob,25
Output (Database Table):
    id | name  | age
    1  | Alice | 30
    2  | Bob   | 25
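The load step behind this input/output pair can be reproduced in a few lines of Python. This is a minimal sketch that uses an in-memory SQLite database as a stand-in for the MySQL target; the column types are assumptions, since the original does not give a table schema.

```python
import csv
import io
import sqlite3

csv_text = "id,name,age\n1,Alice,30\n2,Bob,25\n"

# In-memory SQLite database standing in for the real target database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, name TEXT, age INTEGER)")

# Parse each CSV row and convert the numeric columns explicitly.
rows = [
    (int(row["id"]), row["name"], int(row["age"]))
    for row in csv.DictReader(io.StringIO(csv_text))
]
conn.executemany("INSERT INTO mytable (id, name, age) VALUES (?, ?, ?)", rows)
conn.commit()

result = conn.execute("SELECT id, name, age FROM mytable ORDER BY id").fetchall()
print(result)  # → [(1, 'Alice', 30), (2, 'Bob', 25)]
```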
Example 2: Real-Time Data Streaming with Kafka
Input (Kafka Topic):
    {"foo":"bar"}
Output (HDFS):
    {"bar":"bar"}
The shift operation renames the key foo to bar and leaves the value unchanged.
Example 3: Data Enrichment with REST API
Input (Original Data):
    {"id":1,"name":"Alice"}
Output (Enriched Data):
    {"id":1,"name":"Alice","newField":"value"}