Structured Query Language (SQL) was developed in the 1970s by researchers at IBM and remains the standard language for managing and manipulating relational databases. It allows you to retrieve, insert, update, and delete data. With cloud-based services such as Google BigQuery ML, Amazon Redshift ML, and Azure Synapse, SQL can also be used to train, evaluate, or deploy machine learning models directly within the database.

Basic SQL Syntax

Every SQL query follows a specific structure. Below is an example of a simple SQL statement that retrieves data from a database.

SQL queries are made up of key components:

  • SELECT: Defines which columns to retrieve.
  • FROM: Specifies the table to query.
  • WHERE: Filters results based on conditions.
  • ORDER BY: Sorts results in ascending or descending order.
  • GROUP BY: Groups results for aggregation.
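As a minimal sketch of how these components fit together, the query below uses all five clauses in one statement. It runs against an in-memory SQLite database through Python's built-in sqlite3 module so that it can be tried without a database server; the employees table and its rows are hypothetical and exist only for illustration. Note that GROUP BY is written before ORDER BY in the statement itself.

```python
import sqlite3

# In-memory database with a hypothetical "employees" table (illustrative data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Ana", "Sales", 50000), ("Ben", "Sales", 62000),
     ("Cara", "IT", 70000), ("Dev", "IT", 48000)],
)

query = """
SELECT department, COUNT(*) AS headcount   -- columns to retrieve
FROM employees                             -- table to query
WHERE salary >= 50000                      -- keep only matching rows
GROUP BY department                        -- one result row per department
ORDER BY headcount DESC                    -- largest group first
"""
rows = conn.execute(query).fetchall()
print(rows)
```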

Retrieving Data with SELECT

The SELECT statement retrieves data from a specified table. You can choose to select all columns or only specific ones.
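The two forms can be sketched as follows, assuming a hypothetical customers table created in an in-memory SQLite database purely for illustration. SELECT * returns every column, while naming columns returns only those listed.

```python
import sqlite3

# Hypothetical "customers" table with illustrative rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Ana", "Lisbon"), (2, "Ben", "Oslo")],
)

# Select all columns.
all_columns = conn.execute("SELECT * FROM customers").fetchall()

# Select only specific columns.
some_columns = conn.execute("SELECT name, city FROM customers").fetchall()

print(all_columns)
print(some_columns)
```

Listing columns explicitly is usually the safer habit, since the result does not change shape if columns are later added to the table.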

Filtering Data with WHERE

The WHERE clause allows you to filter results based on conditions. It is useful when you need only the rows of a dataset that meet specific criteria.
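A short sketch of filtering, assuming a hypothetical orders table in an in-memory SQLite database. Conditions can be combined with AND and OR; here only shipped orders above a given amount are returned.

```python
import sqlite3

# Hypothetical "orders" table with illustrative rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, status TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "shipped", 250), (2, "pending", 300), (3, "shipped", 80)],
)

# Keep only rows that satisfy both conditions.
query = """
SELECT id, amount
FROM orders
WHERE status = 'shipped' AND amount > 100
"""
filtered = conn.execute(query).fetchall()
print(filtered)
```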

Sorting Results with ORDER BY

ORDER BY allows you to sort query results. By default, it sorts in ascending order, but you can specify descending order using DESC.
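Both sort directions can be sketched like this, assuming a hypothetical products table in an in-memory SQLite database. ASC is the default and may be omitted; DESC reverses the order.

```python
import sqlite3

# Hypothetical "products" table with illustrative rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT, price INTEGER)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [("keyboard", 45), ("monitor", 180), ("mouse", 20)],
)

# Ascending order (the default).
cheapest_first = conn.execute(
    "SELECT name FROM products ORDER BY price"
).fetchall()

# Descending order.
priciest_first = conn.execute(
    "SELECT name FROM products ORDER BY price DESC"
).fetchall()

print(cheapest_first)
print(priciest_first)
```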

Aggregating Data (SUM, COUNT, AVG, MIN, MAX)

Aggregation functions allow you to summarize data by counting records, summing values, calculating averages, and finding the maximum or minimum values.
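The five functions can be sketched in a single query, assuming a hypothetical orders table in an in-memory SQLite database. Without a GROUP BY clause, each function summarizes the whole table and the query returns one row.

```python
import sqlite3

# Hypothetical "orders" table with illustrative amounts.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 10), (2, 20), (3, 30), (4, 40)],
)

query = """
SELECT COUNT(*)    AS order_count,
       SUM(amount) AS total,
       AVG(amount) AS average,
       MIN(amount) AS smallest,
       MAX(amount) AS largest
FROM orders
"""
summary = conn.execute(query).fetchone()
print(summary)
```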

Joining Multiple Tables (JOIN)

The JOIN clause lets you combine data from multiple tables based on a related column.
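A minimal sketch of an inner join, assuming two hypothetical tables, customers and orders, related through orders.customer_id. The tables and rows are created in an in-memory SQLite database for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two hypothetical tables linked by customers.id = orders.customer_id.
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE orders (id INTEGER, customer_id INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "Ana"), (2, "Ben"), (3, "Cara")],
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 1, 25), (2, 2, 40), (3, 1, 15)],
)

# Pair each order with the name of the customer who placed it.
query = """
SELECT c.name, o.amount
FROM customers AS c
JOIN orders AS o ON o.customer_id = c.id
ORDER BY o.id
"""
joined = conn.execute(query).fetchall()
print(joined)
```

A plain JOIN is an inner join, so Cara, who has no orders in this sample data, does not appear in the result. A LEFT JOIN from customers would keep her row with a NULL amount.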

Modifying Data (INSERT, UPDATE, DELETE)

You can modify database records using INSERT (to add new data), UPDATE (to modify existing records), and DELETE (to remove records).
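The three statements can be sketched in sequence, assuming a hypothetical customers table in an in-memory SQLite database. The WHERE clause on UPDATE and DELETE matters: without it, the statement applies to every row in the table.

```python
import sqlite3

# Hypothetical "customers" table, created empty.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")

# INSERT: add new rows.
conn.execute("INSERT INTO customers (id, name, city) VALUES (1, 'Ana', 'Lisbon')")
conn.execute("INSERT INTO customers (id, name, city) VALUES (2, 'Ben', 'Oslo')")

# UPDATE: change an existing row, selected by its id.
conn.execute("UPDATE customers SET city = 'Porto' WHERE id = 1")

# DELETE: remove a row, selected by its id.
conn.execute("DELETE FROM customers WHERE id = 2")

conn.commit()
remaining = conn.execute("SELECT id, name, city FROM customers").fetchall()
print(remaining)
```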

Running Machine Learning in BigQuery

Google BigQuery ML lets you create and run machine learning models using SQL queries. Below is an example of how to create a K-means clustering model.
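What follows is a sketch of what such a statement could look like, not a tested deployment. The dataset, model, table, and column names (mydataset, customer_segments, customers, annual_spend, visits_per_month) and the choice of three clusters are all assumptions made for illustration. The statements are held in Python strings so the block runs anywhere; actually executing them requires a Google Cloud project with BigQuery enabled.

```python
# BigQuery ML statement that trains a K-means model.
# All names below are hypothetical placeholders.
create_model_sql = """
CREATE OR REPLACE MODEL `mydataset.customer_segments`
OPTIONS (
  model_type = 'kmeans',   -- clustering algorithm
  num_clusters = 3         -- assumed number of segments
) AS
SELECT annual_spend, visits_per_month
FROM `mydataset.customers`
"""

# Once trained, ML.PREDICT assigns each input row to a cluster.
assign_clusters_sql = """
SELECT *
FROM ML.PREDICT(MODEL `mydataset.customer_segments`,
                TABLE `mydataset.customers`)
"""

print(create_model_sql)
print(assign_clusters_sql)

# To run these against BigQuery (assumes the google-cloud-bigquery
# package is installed and credentials are configured):
#   from google.cloud import bigquery
#   client = bigquery.Client()
#   client.query(create_model_sql).result()
```

K-means is unsupervised, so the training query selects only feature columns and no label column.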