SQL Syntax and Datatypes

SQL is a standardized programming language specifically designed for managing and manipulating relational databases. Think of SQL as the universal language that allows you to communicate with databases, similar to how English allows people to communicate with each other.

What Makes SQL Special:

  • Declarative Language: You tell the database what you want, not how to get it
  • Set-Based Operations: Works with entire sets of data rather than individual records
  • Standardized: Works across different database systems with minor variations
  • English-Like Syntax: Uses words like SELECT, FROM, WHERE that are intuitive

SQL Syntax Rules

Case Sensitivity in SQL

SQL Keywords: In most database systems, SQL keywords (SELECT, FROM, WHERE, etc.) are not case sensitive. This means all these variations work the same way:

  • SELECT * FROM customers
  • select * from customers
  • Select * From Customers

Best Practice: Use UPPERCASE for SQL keywords to improve readability and follow industry conventions.

Database Objects: The case sensitivity of database objects (table names, column names) depends on the database system:

Case Insensitive Systems (MySQL on Windows, SQL Server):

-- All these refer to the same table
SELECT * FROM Customers;
SELECT * FROM customers;
SELECT * FROM CUSTOMERS;

Case Sensitive Systems (PostgreSQL, MySQL on Linux):

-- These refer to different tables
SELECT * FROM Customers;  -- Table named "Customers"
SELECT * FROM customers;  -- Table named "customers"
SELECT * FROM CUSTOMERS;  -- Table named "CUSTOMERS"

Best Practice: Always use consistent naming conventions. Many developers prefer lowercase with underscores (customer_orders) or camelCase (customerOrders).

Statement Termination with Semicolons

The Semicolon Rule: Every SQL statement should end with a semicolon (;). This tells the database engine where one statement ends and another begins.

Why Semicolons Matter:

  • Statement Separation: Allows multiple statements in one execution
  • Clarity: Makes code more readable and organized
  • Parser Requirements: Some database systems require semicolons

Examples of Proper Statement Termination:

-- Single statement
SELECT customer_name FROM customers WHERE city = 'New York';

-- Multiple statements
CREATE TABLE products (id INT, name VARCHAR(100));
INSERT INTO products VALUES (1, 'Laptop');
SELECT * FROM products;

When Semicolons Are Optional:

  • Single statement execution in some database tools
  • Last statement in a batch (varies by system)
  • Interactive mode in some command-line tools

Best Practice: Always use semicolons regardless of whether they’re required. This ensures your SQL works consistently across different environments.

SQL Statement Structure

Basic SQL Statement Pattern: Every SQL statement follows a logical structure with specific clauses in a particular order:

SELECT column1, column2          -- What data to retrieve
FROM table_name                  -- Where to get it from
WHERE condition                  -- What conditions to apply
GROUP BY column                  -- How to group results
HAVING group_condition           -- Conditions on groups
ORDER BY column                  -- How to sort results
LIMIT number;                    -- How many results to return

Order Matters: SQL clauses must appear in the correct sequence. You cannot put WHERE before FROM, for example.

Data Types in SQL

Data types define what kind of information can be stored in each column of a database table. Think of them as rules that tell the database how to store, validate, and process data.

Numeric Data Types

INTEGER Types

Purpose: Store whole numbers (no decimal points) Variations:

  • TINYINT: Very small integers (usually 0 to 255 or -128 to 127)
  • SMALLINT: Small integers (usually -32,768 to 32,767)
  • INT or INTEGER: Standard integers (usually -2 billion to +2 billion)
  • BIGINT: Large integers (for very large numbers)

Real World Examples:

-- Employee table
CREATE TABLE employees (
    employee_id INT,           -- Unique identifier
    age TINYINT,              -- Age (0-120 range)
    salary INT,               -- Annual salary
    total_sales BIGINT        -- Lifetime sales (could be very large)
);

When to Use Integers:

  • Counting items (quantity, number of orders)
  • Identifiers (customer ID, product ID)
  • Age, year, or other whole number measurements

DECIMAL and FLOAT Types

DECIMAL/NUMERIC: Exact decimal numbers with fixed precision FLOAT/REAL: Approximate decimal numbers (faster but less precise)

Understanding Precision and Scale:

  • Precision: Total number of digits
  • Scale: Number of digits after decimal point
  • DECIMAL(10,2): Up to 10 total digits, 2 after decimal (like 12345678.99)

Real World Examples:

-- Financial data where precision matters
CREATE TABLE financial_records (
    transaction_id INT,
    amount DECIMAL(15,2),      -- Money values (exact)
    interest_rate DECIMAL(5,4), -- Percentage with 4 decimal places
    calculated_value FLOAT     -- Scientific calculations (approximate)
);

When to Use Each:

  • DECIMAL: Money, percentages, any value requiring exact precision
  • FLOAT: Scientific calculations, measurements where small inaccuracies are acceptable

String Data Types

Character Types

CHAR(n): Fixed-length strings VARCHAR(n): Variable-length strings TEXT: Large text blocks

Understanding Fixed vs Variable Length:

CHAR Examples:

-- State codes are always 2 characters
state_code CHAR(2)     -- 'NY', 'CA', 'TX'

-- Product codes with fixed format
product_code CHAR(10)  -- 'PROD000001', 'PROD000002'

If you store ‘NY’ in CHAR(10), it uses full 10 characters with padding spaces.

VARCHAR Examples:

-- Names vary in length
first_name VARCHAR(50)    -- 'John' uses 4 characters, 'Alexander' uses 9

-- Email addresses
email VARCHAR(255)        -- Uses only as much space as needed

TEXT for Large Content:

-- Blog posts, comments, descriptions
CREATE TABLE articles (
    id INT,
    title VARCHAR(200),
    content TEXT,             -- Can store thousands of characters
    summary VARCHAR(500)
);

Choosing String Types:

  • CHAR: When length is always the same (codes, flags)
  • VARCHAR: When length varies but has a reasonable maximum
  • TEXT: When content can be very long and length is unpredictable

Date and Time Types

Understanding Date/Time Storage

Databases need to store temporal data in ways that allow for calculations, comparisons, and formatting.

DATE: Stores year, month, and day TIME: Stores hour, minute, second (and sometimes microseconds) DATETIME/TIMESTAMP: Stores both date and time together

Real World Examples:

CREATE TABLE events (
    event_id INT,
    event_name VARCHAR(100),
    event_date DATE,              -- '2024-12-25'
    start_time TIME,              -- '14:30:00'
    created_at DATETIME,          -- '2024-12-25 14:30:00'
    last_updated TIMESTAMP        -- Includes timezone information
);

Date Format Standards: Most databases use ISO format: YYYY-MM-DD for dates and HH:MM:SS for times.

Practical Applications:

-- Scheduling system
CREATE TABLE appointments (
    appointment_id INT,
    patient_name VARCHAR(100),
    appointment_date DATE,        -- When is the appointment
    appointment_time TIME,        -- What time
    duration_minutes INT,         -- How long
    created_on TIMESTAMP         -- When was this record created
);

Boolean Data Types

BOOLEAN/BOOL: Stores true/false values Some databases use TINYINT(1) or BIT to represent boolean values.

Understanding Boolean Logic:

  • TRUE: Represents yes, on, active, enabled
  • FALSE: Represents no, off, inactive, disabled
  • NULL: Unknown or undefined state

Real World Examples:

CREATE TABLE user_preferences (
    user_id INT,
    email_notifications BOOLEAN,    -- TRUE = wants emails
    is_premium_member BOOLEAN,      -- TRUE = premium account
    account_active BOOLEAN,         -- TRUE = account is active
    newsletter_subscribed BOOLEAN   -- TRUE = subscribed to newsletter
);

Practical Usage: Boolean columns are perfect for:

  • Feature flags (is_enabled, is_visible)
  • Status indicators (is_active, is_deleted)
  • User preferences (wants_notifications, accepts_marketing)
  • Conditional logic (is_paid, is_shipped, is_completed)

Choosing the Right Data Type

Performance Considerations:

  • Smaller data types use less storage and memory
  • Appropriate data types enable better indexing
  • String operations are generally slower than numeric operations

Data Integrity Benefits:

  • Correct data types prevent invalid data entry
  • Enable database constraints and validations
  • Support proper sorting and comparison operations

Example of Poor vs Good Data Type Choices:

Poor Choice:

CREATE TABLE products (
    price VARCHAR(20),        -- Storing numbers as text
    in_stock VARCHAR(10),     -- Storing boolean as text
    created_date VARCHAR(50)  -- Storing dates as text
);

Good Choice:

CREATE TABLE products (
    price DECIMAL(10,2),      -- Proper numeric type
    in_stock BOOLEAN,         -- Proper boolean type
    created_date DATETIME     -- Proper date type
);

Operators in SQL

Operators are symbols or words that perform operations on data values. They’re the building blocks for creating conditions, calculations, and logical expressions in SQL.

Arithmetic Operators

Arithmetic operators perform mathematical calculations on numeric data.

Basic Arithmetic Operations

Addition (+): Adds two numbers together Subtraction (-): Subtracts the second number from the first Multiplication (*): Multiplies two numbers Division (/): Divides the first number by the second Modulus (%): Returns the remainder after division

Real World Examples:

Calculating Totals:

-- Calculate order total with tax
SELECT 
    order_id,
    subtotal,
    tax_rate,
    subtotal * tax_rate AS tax_amount,
    subtotal + (subtotal * tax_rate) AS total_amount
FROM orders;

Price Calculations:

-- Calculate discounted prices
SELECT 
    product_name,
    original_price,
    discount_percentage,
    original_price * (discount_percentage / 100) AS discount_amount,
    original_price - (original_price * discount_percentage / 100) AS sale_price
FROM products;

Inventory Management:

-- Calculate remaining stock after orders
SELECT 
    product_id,
    current_stock,
    ordered_quantity,
    current_stock - ordered_quantity AS remaining_stock
FROM inventory;

Advanced Arithmetic Applications

Time Calculations:

-- Calculate employee work hours
SELECT 
    employee_name,
    hours_worked,
    hourly_rate,
    hours_worked * hourly_rate AS gross_pay,
    (hours_worked * hourly_rate) * 0.20 AS tax_deduction,
    (hours_worked * hourly_rate) - ((hours_worked * hourly_rate) * 0.20) AS net_pay
FROM payroll;

Performance Metrics:

-- Calculate conversion rates
SELECT 
    campaign_name,
    total_clicks,
    conversions,
    (conversions * 100.0 / total_clicks) AS conversion_rate_percentage
FROM marketing_campaigns;

Comparison Operators

Comparison operators compare values and return true or false results. They’re essential for filtering data and creating conditions.

Equality and Inequality

Equal (=): Checks if values are the same Not Equal (!= or <>): Checks if values are different Greater Than (>): First value is larger than second Less Than (<): First value is smaller than second Greater Than or Equal (>=): First value is larger than or equal to second Less Than or Equal (<=): First value is smaller than or equal to second

Practical Applications:

Customer Segmentation:

-- Find high-value customers
SELECT customer_name, total_purchases
FROM customers
WHERE total_purchases >= 10000;

-- Find recent customers
SELECT customer_name, registration_date
FROM customers
WHERE registration_date > '2024-01-01';

Inventory Management:

-- Find low stock items
SELECT product_name, current_stock
FROM products
WHERE current_stock <= 10;

-- Find overpriced items
SELECT product_name, price
FROM products
WHERE price > 1000;

Employee Management:

-- Find eligible employees for promotion
SELECT employee_name, years_experience, performance_rating
FROM employees
WHERE years_experience >= 5 AND performance_rating >= 4.0;

Special Comparison Operators

BETWEEN: Checks if a value falls within a range IN: Checks if a value matches any value in a list LIKE: Checks if text matches a pattern IS NULL / IS NOT NULL: Checks for missing values

BETWEEN Examples:

-- Find products in a price range
SELECT product_name, price
FROM products
WHERE price BETWEEN 100 AND 500;

-- Find orders from last quarter
SELECT order_id, order_date
FROM orders
WHERE order_date BETWEEN '2024-01-01' AND '2024-03-31';

IN Examples:

-- Find customers from specific cities
SELECT customer_name, city
FROM customers
WHERE city IN ('New York', 'Los Angeles', 'Chicago');

-- Find orders with specific statuses
SELECT order_id, status
FROM orders
WHERE status IN ('Pending', 'Processing', 'Shipped');

Logical Operators

Logical operators combine multiple conditions to create complex filtering logic.

Core Logical Operators

AND: All conditions must be true OR: At least one condition must be true
NOT: Reverses the truth value of a condition

Understanding Logical Combinations:

AND Operator Applications:

-- Find premium customers in specific regions
SELECT customer_name, membership_level, region
FROM customers
WHERE membership_level = 'Premium' AND region = 'North America';

-- Find products that are both in stock and reasonably priced
SELECT product_name, stock_quantity, price
FROM products
WHERE stock_quantity > 0 AND price <= 100;

OR Operator Applications:

-- Find urgent orders (either high priority or express shipping)
SELECT order_id, priority, shipping_method
FROM orders
WHERE priority = 'High' OR shipping_method = 'Express';

-- Find employees in sales or marketing departments
SELECT employee_name, department
FROM employees
WHERE department = 'Sales' OR department = 'Marketing';

NOT Operator Applications:

-- Find customers who are not from California
SELECT customer_name, state
FROM customers
WHERE NOT state = 'California';

-- Find products that are not discontinued
SELECT product_name, status
FROM products
WHERE NOT status = 'Discontinued';

Complex Logical Combinations

Combining Multiple Operators:

-- Find high-value active customers, excluding internal accounts
SELECT customer_name, account_type, total_purchases, is_active
FROM customers
WHERE (total_purchases > 5000 AND is_active = TRUE) 
  AND NOT account_type = 'Internal';

Parentheses for Precedence:

-- Find products that are either (cheap and in stock) or (expensive and on sale)
SELECT product_name, price, stock_quantity, on_sale
FROM products
WHERE (price < 50 AND stock_quantity > 0) 
   OR (price > 200 AND on_sale = TRUE);

Operator Precedence and Best Practices

Order of Operations in SQL:

  1. Parentheses (highest priority)
  2. Multiplication and Division
  3. Addition and Subtraction
  4. Comparison operators
  5. NOT
  6. AND
  7. OR (lowest priority)

Best Practices for Using Operators:

Use Parentheses for Clarity:

-- Clear and readable
WHERE (status = 'Active' AND region = 'West') 
   OR (status = 'Pending' AND priority = 'High')

-- Ambiguous without parentheses
WHERE status = 'Active' AND region = 'West' 
   OR status = 'Pending' AND priority = 'High'

Consistent Formatting:

-- Well-formatted condition
WHERE customer_type = 'Premium'
  AND total_purchases >= 1000
  AND last_order_date >= '2024-01-01'
  AND region IN ('North', 'South', 'East');

Performance Considerations:

  • Place most selective conditions first in AND chains
  • Use appropriate data types for comparisons
  • Consider indexing columns used in WHERE conditions frequently

Understanding these SQL basics provides the foundation for writing effective database queries and managing data efficiently. Each concept builds upon the others to create a comprehensive toolkit for database interaction.

Track your progress

Mark this subtopic as completed when you finish reading.