Google BigQuery - Dealing with evolving schemas


We are a gaming company that stores events (up to 1 giga events per day) in BigQuery. The events are sharded by month and by application in order to lower query costs.

Now to our problem.

Our current solution supports adding new types of events, which leads to new versions of the table schema. The version is added to the table name.

For example: events_app1_v2_201308, events_app1_v2_201308

If we add events with new column types in September, we get events_app1_v3_201309.

We have written code that finds out which tables are involved (for a date range) and makes a union of them, à la BigQuery's comma-separated FROM clause.

But we have just realised that this does not work when we make unions across different versions of the event tables.

Does anyone have a smart solution for how to deal with this?

Right now we are investigating whether JSON structures could help us. The current solution is just flat columns: [timestamp, eventid, value, value, value, ...]
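To make the setup concrete, here is a minimal sketch of what such table-discovery code could look like. It is not our actual code: the function names, the `version_for_month` callback and the dataset name are hypothetical, and the bracketed `[dataset.table]` form assumes BigQuery's legacy SQL syntax.

```python
from datetime import date


def month_range(start, end):
    """Yield (year, month) for every month from start to end, inclusive."""
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        yield year, month
        month += 1
        if month > 12:
            year, month = year + 1, 1


def tables_for_range(app, start, end, version_for_month):
    """List the monthly shard names that cover a date range.

    version_for_month is a hypothetical callback that returns the schema
    version in use for a given (year, month).
    """
    names = []
    for year, month in month_range(start, end):
        version = version_for_month(year, month)
        names.append(f"events_{app}_v{version}_{year:04d}{month:02d}")
    return names


def from_clause(dataset, tables):
    """Join the tables with commas, which BigQuery reads as a union."""
    return ", ".join(f"[{dataset}.{name}]" for name in tables)


def example_version(year, month):
    # Assumption for illustration: v3 was introduced in September 2013.
    return 3 if (year, month) >= (2013, 9) else 2


tables = tables_for_range("app1", date(2013, 8, 1), date(2013, 9, 30), example_version)
print("SELECT eventid FROM " + from_clause("mydataset", tables))
```

With these inputs the FROM clause mixes a v2 table and a v3 table, which is exactly the case described next.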

From https://developers.google.com/bigquery/query-reference#from

Note: Unlike many other SQL-based systems, BigQuery uses the comma syntax to indicate table unions, not joins. This means you can run a query over several tables with compatible !? schemas as follows:

You should be able to modify the table schema of the old tables to add the new columns, and then the union should match. Note that you can only add columns, not remove them. You can use the tables.patch() method to do this, or bq update --schema.

Moreover, as long as the new fields aren't marked as required, the schemas should be considered compatible. If that is not the case, however, it is a bug, so let us know if that is what you're experiencing.
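As a rough illustration of the advice above, the following sketch computes a widened schema for an old table from a newer one. It only builds the field list; sending it to BigQuery (via tables.patch() or bq update) is left out. The helper name `widen_schema` and the sample fields are hypothetical, and the field dictionaries assume the usual name/type/mode layout of a BigQuery schema file.

```python
import json


def widen_schema(old_fields, new_fields):
    """Return old_fields plus any fields that appear only in new_fields.

    Added fields are relaxed from REQUIRED to NULLABLE, since old rows have
    no value for them. Existing fields are never removed or retyped.
    """
    old_by_name = {field["name"]: field for field in old_fields}
    widened = [dict(field) for field in old_fields]
    for field in new_fields:
        existing = old_by_name.get(field["name"])
        if existing is None:
            added = dict(field)
            if added.get("mode", "NULLABLE") == "REQUIRED":
                added["mode"] = "NULLABLE"
            added.setdefault("mode", "NULLABLE")
            widened.append(added)
        elif existing["type"] != field["type"]:
            raise ValueError(
                f"field {field['name']!r} changes type from "
                f"{existing['type']} to {field['type']}"
            )
    return widened


# Hypothetical v2 and v3 schemas for illustration only.
v2_fields = [
    {"name": "timestamp", "type": "TIMESTAMP", "mode": "REQUIRED"},
    {"name": "eventid", "type": "STRING", "mode": "NULLABLE"},
]
v3_fields = v2_fields + [
    {"name": "level", "type": "INTEGER", "mode": "REQUIRED"},
]

patched_v2 = widen_schema(v2_fields, v3_fields)
print(json.dumps(patched_v2, indent=2))
```

The idea is to apply the resulting field list to each older table once, so that every shard in the date range exposes the same columns and old rows simply return NULL for the new ones.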

