nsq-to-bigquery – Stream messages from NSQ directly to Google BigQuery

nsq-to-bigquery

In the spirit of nsq-to-XXX such as nsq-to-http and nsq-to-file – I bring you the very first version of nsq-to-bigquery.

nsq-to-bigquery, as the name suggest, streams data from an NSQ channel into Google’s BigQuery using the Streaming API and provide very effective means to stream data that should be then further analysed and aggregated by BigQuery’s excellent performance.

This is a (very) initial version so it has some limitations and assumptions.

Limitations / Assumptions

  • The BigQuery table MUST exist prior to streaming the data
  • The NSQ message being sent MUST be a valid JSON string
  • The JSON format MUST be a simple flat dictionary (key and simple value. Value can’t be another dictionary or list)
  • The JSON format MUST match the format of the BigQuery table
  • At the moment there is no support for batching so each message will issue an API call to BigQuery with a single line of data

Planed Features:

  • Support batching with flushing based on X number of rows or Y amount of time passed since last flush
  • Flushing will happen in parallel with receiving information so there is almost no delay

Stay tuned on the github repo for more news.

Leave a Reply