Published November 05, 2021

GSP072 - BigQuery Qwik Start - Console


Overview
Storing and querying massive datasets can be time-consuming and expensive without the right hardware and infrastructure. BigQuery is an enterprise data warehouse that solves this problem by enabling super-fast SQL queries using the processing power of Google's infrastructure. Simply move your data into BigQuery and let Google handle the hard work. You can control access to both the project and your data based on your business needs, such as giving others the ability to view or query your data.


You can access BigQuery through the Console, the classic Web UI, or a command-line tool, or by making calls to the BigQuery REST API using client libraries for languages such as Java, .NET, or Python. There are also a variety of third-party tools that you can use to interact with BigQuery, for tasks such as visualizing or loading data.

This hands-on lab shows you how to use the Web UI to query public tables and load sample data into BigQuery.


Source:
This lab is from Qwiklabs.


Open BigQuery
The BigQuery console provides an interface for querying tables, including the public datasets that BigQuery offers. The query you will run reads a table from one of those public datasets. It uses standard SQL and limits the results to 10 rows.

Open BigQuery Console
In the Google Cloud Console, select Navigation menu > BigQuery.

The Welcome to BigQuery in the Cloud Console message box opens. This message box provides a link to the quickstart guide and the release notes.

Click Done.

The BigQuery console opens.


Query a public dataset
1. Copy and paste the following query into the BigQuery Query editor:
#standardSQL
SELECT
 weight_pounds, state, year, gestation_weeks
FROM
 `bigquery-public-data.samples.natality`
ORDER BY weight_pounds DESC LIMIT 10;

This data sample holds information about US natality (birth rates).

A green or red check displays depending on whether the query is valid or invalid. If the query is valid, the validator also describes the amount of data to be processed after you run the query.

This information helps determine the cost to run a query.
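The same estimate can be requested outside the console with a dry run, which validates the query and reports the bytes it would scan without executing it. The sketch below assumes the google-cloud-bigquery Python package is installed and default credentials are configured; format_bytes and estimate_bytes are illustrative helper names, not part of the lab.

```python
QUERY = """
SELECT weight_pounds, state, year, gestation_weeks
FROM `bigquery-public-data.samples.natality`
ORDER BY weight_pounds DESC
LIMIT 10
"""


def format_bytes(num_bytes):
    """Render a byte count using binary units (B, KiB, MiB, GiB, TiB)."""
    size = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB", "TiB"):
        if size < 1024 or unit == "TiB":
            return f"{size:.1f} {unit}"
        size /= 1024


def estimate_bytes(sql):
    """Ask BigQuery how many bytes the query would scan, without running it."""
    # Imported lazily so format_bytes works even without the client installed.
    from google.cloud import bigquery

    client = bigquery.Client()  # assumption: default project and credentials
    config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
    job = client.query(sql, job_config=config)  # a dry-run job returns at once
    return job.total_bytes_processed


print(format_bytes(1536))  # 1.5 KiB
```

In an authenticated environment, format_bytes(estimate_bytes(QUERY)) would give a figure comparable to the one the validator shows.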

2. Click the Run button.
Your query results should show ten rows, sorted from the highest weight_pounds value downward.
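For readers who prefer the Python client library mentioned in the overview, the query can be run roughly as follows. This is a minimal sketch, assuming google-cloud-bigquery is installed and credentials are configured; heaviest_births is a hypothetical helper name, and the client is passed in so the function itself has no hard dependency.

```python
QUERY = """
SELECT weight_pounds, state, year, gestation_weeks
FROM `bigquery-public-data.samples.natality`
ORDER BY weight_pounds DESC
LIMIT 10
"""


def heaviest_births(client, sql=QUERY):
    """Run the query and return each result row as a plain dict."""
    job = client.query(sql)  # submits the query job
    # result() blocks until the job finishes, then yields Row objects.
    return [dict(row.items()) for row in job.result()]


# Usage, in an environment with the client library and credentials:
#   from google.cloud import bigquery
#   for row in heaviest_births(bigquery.Client()):
#       print(row)
```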

You can browse the schema of other public datasets in BigQuery by clicking + ADD DATA > Explore public datasets, then searching for "bigquery public data" in the Search field.


Load custom data into a table
To load custom data into a table, you perform the following tasks:
Create a dataset
Create a table
Add data to your project (to a storage bucket)
Load the data from the bucket to the table you created

Create a dataset
Datasets help control access to tables and views in a project. This lab uses only one table, but you still need a dataset to hold the table.

1. In the Explorer section of the left pane, click View actions next to your project ID, then click Create dataset.

2. Set Dataset ID to babynames.

3. Leave all other fields at their default settings. Click Create dataset.

Now you have a dataset.
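The dataset can also be created from code. The sketch below assumes a google-cloud-bigquery client object is passed in; ensure_dataset is a hypothetical helper name, and the dataset lands in the client's default project with default settings.

```python
def ensure_dataset(client, dataset_id="babynames"):
    """Create the dataset if it is missing and return its ID."""
    # exists_ok=True turns the call into a no-op when the dataset already exists.
    dataset = client.create_dataset(dataset_id, exists_ok=True)
    return dataset.dataset_id


# Usage, in an environment with the client library and credentials:
#   from google.cloud import bigquery
#   print(ensure_dataset(bigquery.Client()))
```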


Add custom data
The custom data file you'll use contains approximately 7 MB of data about popular baby names, provided by the US Social Security Administration. You'll add the zip file to your project, then create a storage bucket for the specific file that you'll query.

In Cloud Shell, run the following commands to add the data files to your project:
gsutil cp gs://spls/gsp072/baby-names.zip .

unzip baby-names.zip
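Before loading, it can help to check what a data file looks like. The sketch below assumes each line of yob2014.txt has the layout name,gender,count, which matches the columns queried later in this lab; read_names is a hypothetical helper, and the sample rows are made up rather than real SSA figures.

```python
import csv
import io


def read_names(lines):
    """Parse 'name,gender,count' lines into dicts with an integer count."""
    records = []
    for row in csv.reader(lines):
        if not row:  # skip blank lines
            continue
        name, gender, count = row
        records.append({"name": name, "gender": gender, "count": int(count)})
    return records


# Made-up rows in the assumed layout, for illustration only.
sample = io.StringIO("Alice,F,120\nBob,M,95\n")
print(read_names(sample))
```

With the unzipped file in the current directory, the same function could be called as read_names(open("yob2014.txt", newline="")).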


Create a Cloud Storage bucket
Now create a Cloud Storage bucket to hold the data files you downloaded.

1. In the Cloud Console, select Navigation menu > Cloud Storage > Browser, and then click Create bucket.
Note: If prompted about unsaved work, click LEAVE.

2. Give your bucket a globally unique name, then click Create.

3. In Cloud Shell, run the following to copy the file yob2014.txt into your bucket. Replace <your_bucket> with the name of the bucket you just created:
gsutil cp yob2014.txt gs://<your_bucket>
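The same copy can be sketched in Python. This assumes the google-cloud-storage package and an existing bucket; upload_file is a hypothetical helper name, and the storage client is passed in by the caller.

```python
import os


def upload_file(storage_client, bucket_name, local_path="yob2014.txt"):
    """Upload a local file to an existing bucket and return its gs:// URI."""
    object_name = os.path.basename(local_path)  # store under the bare file name
    blob = storage_client.bucket(bucket_name).blob(object_name)
    blob.upload_from_filename(local_path)
    return f"gs://{bucket_name}/{object_name}"


# Usage, in an environment with the library and credentials:
#   from google.cloud import storage
#   print(upload_file(storage.Client(), "<your_bucket>"))
```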



Load the data into a new table
Next you create a table inside the babynames dataset, then load the data file from your storage bucket into the new table.

1. In the Cloud Console, select Navigation menu > BigQuery to return to the BigQuery console.

2. Click View actions next to the babynames dataset, click Open, and then click Create table.

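3. In the Create table dialog, set the following fields and leave all others at the default value:
Source: Google Cloud Storage, pointing at yob2014.txt in the bucket you created
File format: CSV
Table name: names_2014
Schema: name:string,gender:string,count:integer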


4. Click the Create Table button.
When BigQuery is finished creating the table and loading the data, you see the names_2014 table under the babynames dataset.
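A load job started from code does the same work as the dialog. The sketch below assumes google-cloud-bigquery is installed, the file sits at the top level of your bucket, and the table uses the columns name, gender, and count with the types shown; SCHEMA, source_uri, and load_names are illustrative names rather than anything defined by the lab.

```python
# Assumed column layout, matching the columns the lab later queries.
SCHEMA = [("name", "STRING"), ("gender", "STRING"), ("count", "INTEGER")]


def source_uri(bucket_name, object_name="yob2014.txt"):
    """Build the Cloud Storage URI of the uploaded data file."""
    return f"gs://{bucket_name}/{object_name}"


def load_names(bucket_name, table_id="babynames.names_2014"):
    """Load the CSV file from the bucket into the table; return the row count."""
    # Imported lazily so the helpers above work without the client installed.
    from google.cloud import bigquery

    client = bigquery.Client()  # assumption: default project and credentials
    config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.CSV,
        schema=[bigquery.SchemaField(name, kind) for name, kind in SCHEMA],
    )
    job = client.load_table_from_uri(
        source_uri(bucket_name), table_id, job_config=config
    )
    job.result()  # wait for the load job to finish
    return client.get_table(table_id).num_rows


print(source_uri("example-bucket"))  # gs://example-bucket/yob2014.txt
```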


Preview the table
Check your table! View the first few rows of the data.

Click the names_2014 table in the left-hand menu, then click Preview.

Your table is ready for queries.
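The Preview tab has a rough counterpart in the Python client: list_rows reads rows straight from the table without running a query. A minimal sketch, assuming a google-cloud-bigquery client is passed in; preview is a hypothetical helper name.

```python
def preview(client, table_id="babynames.names_2014", limit=5):
    """Return the first few rows of the table as plain dicts."""
    rows = client.list_rows(table_id, max_results=limit)
    return [dict(row.items()) for row in rows]


# Usage, in an environment with the client library and credentials:
#   from google.cloud import bigquery
#   for row in preview(bigquery.Client()):
#       print(row)
```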


Query a custom dataset
Running a query against custom data works the same way as the public dataset query you ran earlier, except that now you're querying your own table instead of a public table.

In BigQuery, click the Compose New query button in the top right corner to clear out your previous query.

Paste or type the following query into the Query editor.

Note: If your dataset name is something other than babynames, update the query with your dataset name.
#standardSQL
SELECT
 name, count
FROM
 `babynames.names_2014`
WHERE
 gender = 'M'
ORDER BY count DESC LIMIT 5;

Click the Run button. The query displays the top five boys' names for the year of data (2014) you loaded into the table.
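To make the query's logic concrete, the sketch below applies the same three steps (filter on gender, sort by count descending, keep the first few rows) to in-memory records. top_names is a hypothetical helper, and the rows are made up for illustration, not real SSA figures.

```python
def top_names(rows, gender="M", limit=5):
    """Mirror the SQL: WHERE gender = ..., ORDER BY count DESC, LIMIT ..."""
    matching = [r for r in rows if r["gender"] == gender]
    matching.sort(key=lambda r: r["count"], reverse=True)
    return [(r["name"], r["count"]) for r in matching[:limit]]


# Made-up rows, for illustration only.
rows = [
    {"name": "Ann", "gender": "F", "count": 300},
    {"name": "Ben", "gender": "M", "count": 120},
    {"name": "Cal", "gender": "M", "count": 250},
]
print(top_names(rows, limit=2))  # [('Cal', 250), ('Ben', 120)]
```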



Congratulations!
You used the BigQuery Web UI to query public tables and load sample data into BigQuery.




First published / Last updated: 2021.11.04 / 2021.11.04
