
Data streaming

Streaming data must be durably captured by massively scalable storage that can handle high data volumes from producers. The producers can number in the thousands, each generating data continuously and typically submitting small records (on the order of kilobytes) at the same time as the others.

Streaming data comes from a wide variety of sources: log files generated by customers using mobile or web applications, ecommerce purchases, in-game player activity, information from social networks, financial trading floors, or geospatial services, and telemetry from connected devices or instrumentation in data centers.

Case 1

[Figure: kds-kda-api-segamaker architecture diagram]

  1. An Amazon EC2 instance uses the Amazon Kinesis Producer Library (KPL) to generate data.
  2. Kinesis Data Streams stores the incoming streaming data.
  3. Kinesis Data Analytics processes the incoming records and asynchronously invokes an external endpoint.
  4. The demo application invokes an AWS Lambda function.
  5. The external API can be any integration supported by Amazon API Gateway (for example, an Amazon SageMaker endpoint).
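
The producer side of steps 1 and 2 can be sketched as follows. This is an illustration under stated assumptions, not the demo application's code. The KPL is a Java library, so this sketch uses plain batching in Python instead, producing entries in the shape that the Kinesis `PutRecords` API accepts (`Data` plus `PartitionKey`). The helper names (`to_kinesis_record`, `batch_records`, `send_batches`), the event fields, and the batch size of 500 are assumptions made for the example. Check the current service quotas before relying on that number.

```python
import json
from typing import Any, Dict, Iterable, List

# Assumption: at most 500 records per PutRecords call; verify against current quotas.
MAX_RECORDS_PER_CALL = 500


def to_kinesis_record(event: Dict[str, Any], key_field: str) -> Dict[str, Any]:
    """Encode one event as a PutRecords entry: JSON bytes plus a partition key."""
    return {
        "Data": json.dumps(event, separators=(",", ":")).encode("utf-8"),
        "PartitionKey": str(event[key_field]),
    }


def batch_records(
    events: Iterable[Dict[str, Any]],
    key_field: str,
    batch_size: int = MAX_RECORDS_PER_CALL,
) -> List[List[Dict[str, Any]]]:
    """Group small events into batches so each request carries many records."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batches: List[List[Dict[str, Any]]] = []
    current: List[Dict[str, Any]] = []
    for event in events:
        current.append(to_kinesis_record(event, key_field))
        if len(current) == batch_size:
            batches.append(current)
            current = []
    if current:  # keep the final, partially filled batch
        batches.append(current)
    return batches


def send_batches(client: Any, stream_name: str,
                 batches: List[List[Dict[str, Any]]]) -> int:
    """Send each batch and return the total count of records reported as failed.

    `client` is expected to behave like a boto3 Kinesis client, that is, to
    expose put_records(StreamName=..., Records=...).
    """
    failed = 0
    for batch in batches:
        response = client.put_records(StreamName=stream_name, Records=batch)
        failed += response.get("FailedRecordCount", 0)
    return failed
```

With boto3 installed and credentials configured, `send_batches(boto3.client("kinesis"), "my-stream", batches)` would submit the batches, where `my-stream` is a placeholder stream name. The sketch only counts failed records; a real producer would also retry them, which the KPL handles on the caller's behalf.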