nsq-to-bigquery – Stream messages from NSQ directly to Google BigQuery

nsq-to-bigquery

In the spirit of the nsq-to-XXX tools such as nsq-to-http and nsq-to-file, I bring you the very first version of nsq-to-bigquery.

nsq-to-bigquery, as the name suggests, streams data from an NSQ channel into Google’s BigQuery using the Streaming API. It provides an effective way to stream data that BigQuery can then analyse and aggregate quickly.

This is a (very) initial version, so it has some limitations and assumptions.

Limitations / Assumptions

  • The BigQuery table MUST exist prior to streaming the data
  • The NSQ message being sent MUST be a valid JSON string
  • The JSON MUST be a simple flat dictionary of keys and simple values (a value can’t be another dictionary or a list)
  • The JSON format MUST match the format of the BigQuery table
  • At the moment there is no support for batching, so each message issues its own API call to BigQuery with a single row of data
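
To make the message constraints above concrete, here is a minimal Python sketch of the checks a message would have to pass before being streamed. It is an illustration only and not code from nsq-to-bigquery: the helper names parse_flat_row and build_insert_body are hypothetical, and the request body assumes the general shape of BigQuery's tabledata.insertAll call (a "rows" list whose entries carry the row under a "json" key). Checking the row against the table schema is left out, since that depends on the table.

```python
import json


def parse_flat_row(body: bytes) -> dict:
    """Hypothetical helper: decode an NSQ message body and enforce the flat-JSON rule."""
    row = json.loads(body)  # raises ValueError (JSONDecodeError) on invalid JSON
    if not isinstance(row, dict):
        raise ValueError("message must be a JSON object")
    for key, value in row.items():
        # Nested dictionaries and lists are not allowed, only simple values.
        if isinstance(value, (dict, list)):
            raise ValueError(f"field {key!r} is nested; only simple values are allowed")
    return row


def build_insert_body(row: dict) -> dict:
    """Hypothetical helper: wrap one row in an assumed insertAll-style request body."""
    # Without batching, every message becomes a request carrying exactly one row.
    return {"rows": [{"json": row}]}
```

A message such as {"user": "alice", "clicks": 3} passes, while {"user": {"name": "alice"}} is rejected because its value is another dictionary.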

Planned Features:

  • Support batching, with a flush after X rows or after Y amount of time has passed since the last flush
  • Flushing will happen in parallel with receiving messages, so there is almost no delay
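
The planned batching could look roughly like the Python sketch below. It is a guess at the idea, not the planned implementation: the RowBatcher class, its parameters and their defaults are all hypothetical. Rows are buffered and handed to a flush callback once max_rows rows have accumulated or max_age_seconds have passed since the last flush. For brevity the age is only checked when a new row arrives, so a real version would also need a timer to flush an idle buffer.

```python
import time
from typing import Callable, List


class RowBatcher:
    """Hypothetical batcher: flush after max_rows rows or max_age_seconds since the last flush."""

    def __init__(self, flush: Callable[[List[dict]], None],
                 max_rows: int = 500, max_age_seconds: float = 5.0,
                 clock: Callable[[], float] = time.monotonic):
        self._flush = flush          # called with the list of buffered rows
        self._max_rows = max_rows
        self._max_age = max_age_seconds
        self._clock = clock          # injectable so the timing can be tested
        self._rows: List[dict] = []
        self._last_flush = clock()

    def add(self, row: dict) -> None:
        self._rows.append(row)
        too_many = len(self._rows) >= self._max_rows
        too_old = self._clock() - self._last_flush >= self._max_age
        if too_many or too_old:
            self.flush()

    def flush(self) -> None:
        # Swap the buffer out first so new rows can keep arriving.
        rows, self._rows = self._rows, []
        self._last_flush = self._clock()
        if rows:
            self._flush(rows)
```

To flush in parallel with receiving, the callback could submit the batch to a concurrent.futures.ThreadPoolExecutor instead of sending it inline, so that add() returns right away.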

Stay tuned to the GitHub repo for more news.