BigQuery Organisation

BigQuery is structured as a hierarchy with four levels:

Projects: Top-level containers in the Google Cloud Platform that store the data
Datasets: Within projects, datasets hold one or more tables of data
Tables: Within datasets, tables are row-column structures that hold actual data
Jobs: The tasks you are performing on the data, such as running queries, loading data, and exporting data
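
As a rough illustration of how projects, datasets, and tables nest, the containment can be modeled with a few Python dataclasses. This is only a sketch: the class names (Project, Dataset, Table) and their methods are hypothetical and are not part of any BigQuery library. Jobs are left out because they act on the data rather than contain it.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Table:
    name: str


@dataclass
class Dataset:
    name: str
    tables: Dict[str, Table] = field(default_factory=dict)


@dataclass
class Project:
    project_id: str
    datasets: Dict[str, Dataset] = field(default_factory=dict)

    def add_dataset(self, dataset_name: str) -> None:
        # Datasets live directly inside a project.
        self.datasets[dataset_name] = Dataset(dataset_name)

    def add_table(self, dataset_name: str, table_name: str) -> None:
        # Every table must belong to an existing dataset;
        # a missing dataset raises KeyError here.
        dataset = self.datasets[dataset_name]
        dataset.tables[table_name] = Table(table_name)

    def table_ids(self) -> List[str]:
        # Walk project -> dataset -> table and build qualified names.
        return [
            f"{self.project_id}:{ds.name}.{tbl.name}"
            for ds in self.datasets.values()
            for tbl in ds.tables.values()
        ]


project = Project("bigquery-public-data")
project.add_dataset("samples")
project.add_table("samples", "natality")
project.add_table("samples", "github_nested")
print(project.table_ids())
# ['bigquery-public-data:samples.natality', 'bigquery-public-data:samples.github_nested']
```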

Projects

Projects are the top-level containers that store the data
Within the project, you can configure settings, permissions, and other metadata that describe your applications
Each project has a name, ID, and number that you’ll use as identifiers
When billing is enabled, each project is associated with one billing account, but multiple projects can be billed to the same account

Datasets

Datasets allow you to organize and control access to your tables
All tables must belong to a dataset. You must create a dataset before loading data into BigQuery
You can configure permissions at the organization, project, and dataset level

Tables

Tables contain your data in BigQuery
Each table has a schema that describes the data contained in the table, including field names, types, and descriptions
BigQuery supports the following table types:

Native tables: tables backed by native BigQuery storage
External tables: tables backed by storage external to BigQuery
Views: virtual tables defined by a SQL query

Jobs

Jobs are objects that manage asynchronous tasks such as running queries, loading data, and exporting data
You can run multiple jobs concurrently
Completed jobs are listed in the Jobs collection
There are four types of jobs:

Load: load data into a table
Query: run a query against BigQuery data
Extract: export a BigQuery table to Google Cloud Storage
Copy: copy an existing table into another new or existing table
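
To make the idea of a query job concrete, here is a minimal sketch that uses the google-cloud-bigquery Python client. It assumes the library is installed and that credentials and a billing-enabled project are configured in the environment. The helper names build_count_query and run_query_job are invented for this example. The table is written in the dot-separated form that GoogleSQL queries expect.

```python
def build_count_query(table: str) -> str:
    """Return a GoogleSQL query that counts the rows in `table`."""
    return f"SELECT COUNT(*) AS row_count FROM `{table}`"


def run_query_job(sql: str):
    """Submit `sql` as a query job and wait for its rows."""
    # Imported here so the rest of the sketch runs without the library.
    from google.cloud import bigquery

    client = bigquery.Client()       # picks up default credentials and project
    job = client.query(sql)          # starts an asynchronous query job
    print(job.job_id, job.job_type)  # each job has an ID and a type
    return list(job.result())        # blocks until the job has finished


sql = build_count_query("bigquery-public-data.samples.natality")
print(sql)
# rows = run_query_job(sql)  # uncomment once credentials are set up
```

Load, extract, and copy jobs follow the same pattern of starting a job and then waiting on its result.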

Example: Projects, Datasets, and Tables

Here is an example of the left-pane navigation within BigQuery
Projects are identified by the project name

e.g. bigquery-public-data

You can expand projects to see the corresponding datasets and tables

e.g. samples (a dataset)
e.g. github_nested (a table in the samples dataset)

Tables are referenced by their project and dataset as: <project>:<dataset>.<table>

e.g. bigquery-public-data:samples.natality
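
A reference in this form can be split back into its three parts with a few lines of Python. The sketch below is illustrative only: TableRef and parse_table_ref are hypothetical names, and the parser assumes the project ID itself contains no colon.

```python
from typing import NamedTuple


class TableRef(NamedTuple):
    project: str
    dataset: str
    table: str


def parse_table_ref(ref: str) -> TableRef:
    """Split '<project>:<dataset>.<table>' into its three parts."""
    project, colon, rest = ref.partition(":")
    if not colon:
        raise ValueError(f"missing ':' after the project in {ref!r}")
    dataset, dot, table = rest.partition(".")
    if not (project and dataset and dot and table):
        raise ValueError(f"expected <project>:<dataset>.<table>, got {ref!r}")
    return TableRef(project, dataset, table)


ref = parse_table_ref("bigquery-public-data:samples.natality")
print(ref)
# TableRef(project='bigquery-public-data', dataset='samples', table='natality')
```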

Example of Simple Schema

Schema for the natality table in the samples dataset
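
Since a schema is essentially a list of field names, types, and descriptions, its shape can be sketched in plain Python. The fields below are illustrative and loosely modeled on a birth-records table; they are not copied from the actual schema. Field and describe are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Field:
    name: str
    field_type: str
    description: str = ""


# Illustrative fields only, chosen to show the shape of a schema.
schema: List[Field] = [
    Field("year", "INTEGER", "Four-digit year of the birth"),
    Field("state", "STRING", "Two-letter state code"),
    Field("is_male", "BOOLEAN", "True if the child is male"),
    Field("weight_pounds", "FLOAT", "Birth weight in pounds"),
]


def describe(fields: List[Field]) -> str:
    """Render one aligned line per field: name, type, description."""
    return "\n".join(
        f"{f.name:<15}{f.field_type:<10}{f.description}" for f in fields
    )


print(describe(schema))
```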
