BigQuery Organisation
BigQuery is structured as a hierarchy with four levels:
- Projects: Top-level containers in Google Cloud Platform that hold your data
- Datasets: Within projects, datasets hold one or more tables of data
- Tables: Within datasets, tables are row-column structures that hold actual data
- Jobs: Tasks you perform on the data, such as running queries, loading data, and exporting data
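To make the containment relationship concrete, here is a minimal sketch that models projects, datasets, and tables as plain Python objects. It does not call BigQuery; the class and method names (Project, Dataset, Table, add_dataset, add_table) are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Table:
    """A row-column structure that holds the actual data."""
    name: str


@dataclass
class Dataset:
    """Groups one or more tables inside a project."""
    name: str
    tables: Dict[str, Table] = field(default_factory=dict)

    def add_table(self, name: str) -> Table:
        table = Table(name)
        self.tables[name] = table
        return table


@dataclass
class Project:
    """Top-level container that owns datasets."""
    project_id: str
    datasets: Dict[str, Dataset] = field(default_factory=dict)

    def add_dataset(self, name: str) -> Dataset:
        dataset = Dataset(name)
        self.datasets[name] = dataset
        return dataset


# Rebuild a small slice of the public samples hierarchy by hand.
project = Project("bigquery-public-data")
samples = project.add_dataset("samples")
samples.add_table("natality")
samples.add_table("github_nested")

# Walk the hierarchy: project -> dataset -> table.
for dataset in project.datasets.values():
    for table in dataset.tables.values():
        print(f"{project.project_id}:{dataset.name}.{table.name}")
```

Jobs are left out of this model on purpose: they act on tables rather than contain them.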
Projects
- Projects are the top-level containers that store the data
- Within the project, you can configure settings, permissions, and other metadata that describe your applications
- Each project has a name, ID, and number that you’ll use as identifiers
- When billing is enabled, each project is associated with one billing account, but multiple projects can be billed to the same account
Datasets
- Datasets allow you to organize and control access to your tables
- All tables must belong to a dataset. You must create a dataset before loading data into BigQuery
- You can configure permissions at the organization, project, and dataset level
Tables
- Tables contain your data in BigQuery
- Each table has a schema that describes the data contained in the table, including field names, types, and descriptions
- BigQuery supports the following table types:
  - Native tables: tables backed by native BigQuery storage
  - External tables: tables backed by storage external to BigQuery
  - Views: virtual tables defined by a SQL query
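As a rough illustration of what a table schema looks like, the sketch below builds one as a list of field objects and serializes it to JSON, the shape BigQuery uses for schema files (name, type, mode, description). The fields themselves are made up for this example and do not describe any real table.

```python
import json

# Hypothetical schema: each entry describes one column of the table.
schema = [
    {"name": "state", "type": "STRING", "mode": "NULLABLE",
     "description": "Two-letter state code"},
    {"name": "year", "type": "INTEGER", "mode": "REQUIRED",
     "description": "Four-digit year of the record"},
    {"name": "weight_pounds", "type": "FLOAT", "mode": "NULLABLE",
     "description": "Weight in pounds"},
]


def field_names(fields):
    """Return the column names in schema order."""
    return [f["name"] for f in fields]


def required_fields(fields):
    """Return the names of columns that may not be NULL."""
    return [f["name"] for f in fields if f["mode"] == "REQUIRED"]


# Serialize to JSON text, e.g. to save alongside the data being loaded.
schema_json = json.dumps(schema, indent=2)
print(schema_json)
```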
Jobs
- Jobs are objects that manage asynchronous tasks such as running queries, loading data, and exporting data
- You can run multiple jobs concurrently
- Completed jobs are listed in the Jobs collection
- There are four types of jobs:
  - Load: load data into a table
  - Query: run a query against BigQuery data
  - Extract: export a BigQuery table to Google Cloud Storage
  - Copy: copy an existing table into another new or existing table
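The idea of jobs as asynchronous tasks that can run concurrently can be sketched locally with Python's standard library. This is only a simulation: JobType and run_job are hypothetical names, the "work" is a short sleep, and nothing here talks to BigQuery.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from enum import Enum


class JobType(Enum):
    """The four job types listed above."""
    LOAD = "load"
    QUERY = "query"
    EXTRACT = "extract"
    COPY = "copy"


def run_job(job_type: JobType) -> str:
    """Stand-in for real work: sleep briefly, then report completion."""
    time.sleep(0.01)
    return f"{job_type.value} job done"


# Submit one job of each type. submit() returns a future immediately,
# so the four jobs run concurrently rather than one after another.
with ThreadPoolExecutor(max_workers=4) as pool:
    futures = {jt: pool.submit(run_job, jt) for jt in JobType}
    # Collect results once each job has finished.
    results = {jt: fut.result() for jt, fut in futures.items()}

for job_type, outcome in results.items():
    print(job_type.name, "->", outcome)
```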
Example: Projects, Datasets, and Tables
- The left-pane navigation in the BigQuery web UI shows this hierarchy
- Projects are identified by the project name
  - e.g. bigquery-public-data
- You can expand projects to see the corresponding datasets and tables
  - e.g. samples
  - e.g. github_nested
- Tables are referenced by their project and dataset as: <project>:<dataset>.<table>
  - e.g. bigquery-public-data:samples.natality
  - This colon form is the legacy SQL syntax; standard SQL separates all three parts with dots, as in <project>.<dataset>.<table>
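A table reference in the <project>:<dataset>.<table> form can be split into its parts with a few lines of string handling. The sketch below is a minimal illustration: TableRef, parse_table_ref, and to_standard_sql are hypothetical helper names, and the parser assumes the dataset and table names contain no colon or dot.

```python
from typing import NamedTuple


class TableRef(NamedTuple):
    project: str
    dataset: str
    table: str


def parse_table_ref(ref: str) -> TableRef:
    """Split '<project>:<dataset>.<table>' into its three parts."""
    # Split on the last colon so a project ID that itself contains
    # a colon stays in one piece.
    project, colon, rest = ref.rpartition(":")
    dataset, dot, table = rest.partition(".")
    if not (colon and dot and project and dataset and table):
        raise ValueError(f"expected <project>:<dataset>.<table>, got {ref!r}")
    return TableRef(project, dataset, table)


def to_standard_sql(ref: TableRef) -> str:
    """Render the reference in dotted form, quoted with backticks."""
    return f"`{ref.project}.{ref.dataset}.{ref.table}`"


ref = parse_table_ref("bigquery-public-data:samples.natality")
print(ref.project, ref.dataset, ref.table)
print(to_standard_sql(ref))
```

The backticks in the dotted form matter here because the project ID contains hyphens.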