BigQuery Organisation

BigQuery is structured around four kinds of objects: 

  • Projects: Top-level containers in Google Cloud Platform that store the data 
  • Datasets: Within projects, datasets hold one or more tables of data 
  • Tables: Within datasets, tables are row-column structures that hold the actual data 
  • Jobs: The tasks you perform on the data within a project, such as running queries, loading data, and exporting data
Projects 
  • Projects are the top-level containers that store the data 
  • Within the project, you can configure settings, permissions, and other metadata that describe your applications 
  • Each project has a name, ID, and number that you’ll use as identifiers 
  • When billing is enabled, each project is associated with exactly one billing account, but multiple projects can be billed to the same account
Datasets 
  • Datasets allow you to organize and control access to your tables 
  • All tables must belong to a dataset. You must create a dataset before loading data into BigQuery 
  • You can configure permissions at the organization, project, and dataset level
Tables 
  • Tables contain your data in BigQuery 
  • Each table has a schema that describes the data contained in the table, including field names, types, and descriptions 
  • BigQuery supports the following table types: 
    • Native tables: tables backed by native BigQuery storage 
    • External tables: tables backed by storage external to BigQuery
    • Views: virtual tables defined by a SQL query
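To make the idea of a table schema concrete, here is a minimal sketch in Python. It writes a schema as a list of field definitions with name, type, mode, and description keys, which is the general shape BigQuery accepts for JSON schema files. The "births"-style fields and the validate_schema helper are hypothetical and only for illustration; they are not part of any BigQuery library.

```python
import json

# Hypothetical schema for an illustrative table. Each entry describes one
# field: its name, its data type, its mode, and a human-readable description.
SCHEMA = [
    {"name": "year", "type": "INTEGER", "mode": "REQUIRED",
     "description": "Year of birth"},
    {"name": "state", "type": "STRING", "mode": "NULLABLE",
     "description": "Two-letter state code"},
    {"name": "weight_pounds", "type": "FLOAT", "mode": "NULLABLE",
     "description": "Birth weight in pounds"},
]

VALID_MODES = {"NULLABLE", "REQUIRED", "REPEATED"}


def validate_schema(schema):
    """Basic sanity checks (a hypothetical helper, not a BigQuery API)."""
    seen = set()
    for field in schema:
        missing = {"name", "type"} - field.keys()
        if missing:
            raise ValueError(f"field is missing keys: {sorted(missing)}")
        # A field without an explicit mode is treated as NULLABLE here.
        if field.get("mode", "NULLABLE") not in VALID_MODES:
            raise ValueError(f"invalid mode for field {field['name']!r}")
        if field["name"] in seen:
            raise ValueError(f"duplicate field name: {field['name']!r}")
        seen.add(field["name"])


validate_schema(SCHEMA)

# Serialize the schema so it could be saved to a file.
schema_json = json.dumps(SCHEMA, indent=2)
print(schema_json)
```

Keeping the schema as plain data like this makes it easy to review and version alongside the code that loads the table.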
Jobs 
  • Jobs are objects that manage asynchronous tasks such as running queries, loading data, and exporting data 
  • You can run multiple jobs concurrently 
  • Completed jobs are listed in the Jobs collection 
  • There are four types of jobs: 
    • Load: load data into a table 
    • Query: run a query against BigQuery data
    • Extract: export a BigQuery table to Google Cloud Storage
    • Copy: copy an existing table into another new or existing table
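The four job types can be sketched as plain Python dictionaries shaped like the request body of a job in the BigQuery REST API, where the configuration object holds exactly one of query, load, extract, or copy. This is a rough sketch under assumptions: the helper functions are hypothetical, the project, dataset, table, and bucket names are placeholders, and the field names should be checked against the REST reference before use. Nothing here contacts BigQuery.

```python
def table_ref(project, dataset, table):
    """Build a table reference in the projectId/datasetId/tableId shape."""
    return {"projectId": project, "datasetId": dataset, "tableId": table}


def query_job(sql):
    """Query: run a query against BigQuery data."""
    return {"configuration": {"query": {"query": sql, "useLegacySql": False}}}


def load_job(source_uris, destination):
    """Load: load data from the given URIs into a table."""
    return {"configuration": {"load": {
        "sourceUris": list(source_uris),
        "destinationTable": destination,
    }}}


def extract_job(source, destination_uris):
    """Extract: export a table to Google Cloud Storage."""
    return {"configuration": {"extract": {
        "sourceTable": source,
        "destinationUris": list(destination_uris),
    }}}


def copy_job(source, destination):
    """Copy: copy an existing table into another table."""
    return {"configuration": {"copy": {
        "sourceTable": source,
        "destinationTable": destination,
    }}}


def job_type(job):
    """Return the single configuration key, which names the job type."""
    (kind,) = job["configuration"]
    return kind


# Placeholder names, for illustration only.
src = table_ref("my-project", "my_dataset", "source_table")
dst = table_ref("my-project", "my_dataset", "target_table")

jobs = [
    load_job(["gs://example-bucket/data.csv"], dst),
    query_job("SELECT 1"),
    extract_job(src, ["gs://example-bucket/export-*.csv"]),
    copy_job(src, dst),
]
print([job_type(j) for j in jobs])
```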
Example: BigQuery, Datasets, and Tables 
  • Here is an example of the left-pane navigation within BigQuery 
  • Projects are identified by the project ID
    • e.g. bigquery-public-data 
  • You can expand a project to see its datasets, and expand a dataset to see its tables 
    • e.g. the dataset samples
    • e.g. the table github_nested within samples 
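  • Tables are referenced by their project and dataset as: <project>:<dataset>.<table> 
    • e.g. bigquery-public-data:samples.natality
    • This colon form is the legacy SQL style; standard SQL separates all three parts with dots, as in <project>.<dataset>.<table>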
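As a small illustration of the table reference format, the sketch below splits a <project>:<dataset>.<table> string into its three parts and rewrites it in the dotted, backtick-quoted form used in standard SQL queries. The TableRef type and both functions are hypothetical helpers written for this post, and the parser assumes a simple project ID that contains no colon.

```python
from typing import NamedTuple


class TableRef(NamedTuple):
    project: str
    dataset: str
    table: str


def parse_table_ref(ref: str) -> TableRef:
    """Split '<project>:<dataset>.<table>' into its three parts."""
    project, colon, rest = ref.partition(":")
    dataset, dot, table = rest.partition(".")
    if not (colon and dot and project and dataset and table):
        raise ValueError(f"expected <project>:<dataset>.<table>, got {ref!r}")
    return TableRef(project, dataset, table)


def to_standard_sql(ref: TableRef) -> str:
    """Format the reference with dots and backticks for standard SQL."""
    return f"`{ref.project}.{ref.dataset}.{ref.table}`"


ref = parse_table_ref("bigquery-public-data:samples.natality")
print(ref.project)           # bigquery-public-data
print(ref.dataset)           # samples
print(ref.table)             # natality
print(to_standard_sql(ref))  # `bigquery-public-data.samples.natality`
```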
Example of Simple Schema 
Schema for the natality table in the samples dataset




