IoT-based Ingestion
Data Pipes allows ingestion of data from MQTT-based sources. MQTT clients (publishers) are expected to push their data to an MQTT broker.
On AWS, Data Pipes uses AWS IoT Core as the broker. Once data is pushed to the broker, Data Pipes loads it into the data lake in near real time.
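As a rough illustration of the publisher side, the sketch below builds one sensor reading and hands it to a publish function. The JSON payload layout, the topic name sensors/temperature, and the helper names are assumptions made for this example; the guide does not specify a payload format. The publish callable stands in for whichever MQTT client you use to connect to the broker.

```python
import json
import time
from typing import Callable, Optional

# Hypothetical topic name; use the one entered in the Data Pipes source form.
TOPIC = "sensors/temperature"


def build_payload(device_id: str, temperature_c: float,
                  ts: Optional[float] = None) -> str:
    """Serialize one sensor reading as a compact JSON string.

    The field names are illustrative only, not a format required by Data Pipes.
    """
    reading = {
        "device_id": device_id,
        "temperature_c": temperature_c,
        # Fall back to the current time when no timestamp is supplied.
        "ts": int(ts if ts is not None else time.time()),
    }
    return json.dumps(reading, separators=(",", ":"))


def publish_reading(publish: Callable[[str, str], None], device_id: str,
                    temperature_c: float, ts: Optional[float] = None) -> str:
    """Build the payload and pass it to the supplied publish(topic, payload)."""
    payload = build_payload(device_id, temperature_c, ts)
    publish(TOPIC, payload)
    return payload


if __name__ == "__main__":
    # Stand-in publisher that prints instead of sending to a broker.
    publish_reading(lambda topic, msg: print(topic, msg), "dev-1", 21.5)
```

Passing the publish function in as a parameter keeps the payload logic independent of any particular MQTT client library, so it can be exercised without a broker connection.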
Click on the pipeline icon on the left-hand side of the Data Pipes portal.
You will land on the ingestion pipelines page, which lists all the pipelines you have initiated.
Click on the plus icon to create a new pipeline.
Once you click on the plus icon, click on the configure source block to create the source.
From the options, choose STREAM.
Click on the + icon in front of STREAM.
A form opens with the following fields:
Source name: [Mandatory] A unique name to identify the source.
Topic Name: [Mandatory] The name of the MQTT topic to which the sensors push their data.
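Before entering the topic name, it can help to check that the name your publishers use is a valid MQTT topic name. The sketch below applies the general MQTT rules for publish topics: the name must be non-empty, must not contain wildcards or the null character, and must fit in 65535 bytes of UTF-8. The function name is hypothetical, and whether the Data Pipes form itself accepts wildcard filters is not stated here, so treat this as a check on the publisher side only. Individual brokers, including AWS IoT Core, may impose stricter limits.

```python
from typing import List

# Upper bound for a UTF-8 encoded string in the MQTT specification.
# Brokers may enforce a smaller limit; check your broker's documented quotas.
MAX_TOPIC_BYTES = 65535


def validate_publish_topic(topic: str) -> List[str]:
    """Return a list of problems; an empty list means the name is usable."""
    problems = []
    if not topic:
        problems.append("topic name is empty")
    if "+" in topic or "#" in topic:
        # Wildcards are only valid in subscription filters, not when publishing.
        problems.append("wildcards (+, #) are not allowed in a publish topic")
    if "\x00" in topic:
        problems.append("null character is not allowed")
    if len(topic.encode("utf-8")) > MAX_TOPIC_BYTES:
        problems.append("topic name exceeds %d bytes" % MAX_TOPIC_BYTES)
    return problems


if __name__ == "__main__":
    for name in ("sensors/temperature", "sensors/+/temperature", ""):
        print(repr(name), validate_publish_topic(name))
```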
Once the source is configured, click on the configure destination option and select the default destination of the data lake.
STREAM ingestion currently supports only one replication mode:
STREAM: In this mode, the pipeline runs continuously, and any data pushed to the MQTT topic is loaded into the data lake in near real time.
Once the replication mode is selected, click on configure pipeline to provide details such as the pipeline name, the domain name (where the data is to be loaded), and the table name.
Click on the play button to verify the configuration.
Once you create the pipeline after verifying all the details, the “start replication” button will appear at the bottom. Click on it to begin the ingestion.