SQL is a standardized programming language specifically designed for managing and manipulating relational databases. Think of SQL as the universal language that allows you to communicate with databases, similar to how English allows people to communicate with each other.
What Makes SQL Special:
- Declarative Language: You tell the database what you want, not how to get it
- Set-Based Operations: Works with entire sets of data rather than individual records
- Standardized: Works across different database systems with minor variations
- English-Like Syntax: Uses words like SELECT, FROM, WHERE that are intuitive
SQL Syntax Rules
Case Sensitivity in SQL
SQL Keywords: In most database systems, SQL keywords (SELECT, FROM, WHERE, etc.) are not case sensitive. This means all these variations work the same way:
- SELECT * FROM customers
- select * from customers
- Select * From Customers
Best Practice: Use UPPERCASE for SQL keywords to improve readability and follow industry conventions.
Database Objects: The case sensitivity of database objects (table names, column names) depends on the database system:
Case Insensitive Systems (MySQL on Windows, SQL Server):
-- All these refer to the same table
SELECT * FROM Customers;
SELECT * FROM customers;
SELECT * FROM CUSTOMERS;
Case Sensitive Systems (PostgreSQL, MySQL on Linux):
-- These refer to different tables
SELECT * FROM Customers; -- Table named "Customers"
SELECT * FROM customers; -- Table named "customers"
SELECT * FROM CUSTOMERS; -- Table named "CUSTOMERS"
Best Practice: Always use consistent naming conventions. Many developers prefer lowercase with underscores (customer_orders) or camelCase (customerOrders).
Statement Termination with Semicolons
The Semicolon Rule: Every SQL statement should end with a semicolon (;). This tells the database engine where one statement ends and another begins.
Why Semicolons Matter:
- Statement Separation: Allows multiple statements in one execution
- Clarity: Makes code more readable and organized
- Parser Requirements: Some database systems require semicolons
Examples of Proper Statement Termination:
-- Single statement
SELECT customer_name FROM customers WHERE city = 'New York';
-- Multiple statements
CREATE TABLE products (id INT, name VARCHAR(100));
INSERT INTO products VALUES (1, 'Laptop');
SELECT * FROM products;
When Semicolons Are Optional:
- Single statement execution in some database tools
- Last statement in a batch (varies by system)
- Interactive mode in some command-line tools
Best Practice: Always use semicolons regardless of whether they’re required. This ensures your SQL works consistently across different environments.
SQL Statement Structure
Basic SQL Statement Pattern: Every SQL statement follows a logical structure with specific clauses in a particular order:
SELECT column1, column2 -- What data to retrieve
FROM table_name -- Where to get it from
WHERE condition -- What conditions to apply
GROUP BY column -- How to group results
HAVING group_condition -- Conditions on groups
ORDER BY column -- How to sort results
LIMIT number; -- How many results to return
Order Matters: SQL clauses must appear in the correct sequence. You cannot put WHERE before FROM, for example.
Data Types in SQL
Data types define what kind of information can be stored in each column of a database table. Think of them as rules that tell the database how to store, validate, and process data.
Numeric Data Types
INTEGER Types
Purpose: Store whole numbers (no decimal points) Variations:
- TINYINT: Very small integers (usually 0 to 255 or -128 to 127)
- SMALLINT: Small integers (usually -32,768 to 32,767)
- INT or INTEGER: Standard integers (usually -2 billion to +2 billion)
- BIGINT: Large integers (for very large numbers)
Real World Examples:
-- Employee table
CREATE TABLE employees (
employee_id INT, -- Unique identifier
age TINYINT, -- Age (0-120 range)
salary INT, -- Annual salary
total_sales BIGINT -- Lifetime sales (could be very large)
);
When to Use Integers:
- Counting items (quantity, number of orders)
- Identifiers (customer ID, product ID)
- Age, year, or other whole number measurements
DECIMAL and FLOAT Types
DECIMAL/NUMERIC: Exact decimal numbers with fixed precision FLOAT/REAL: Approximate decimal numbers (faster but less precise)
Understanding Precision and Scale:
- Precision: Total number of digits
- Scale: Number of digits after decimal point
- DECIMAL(10,2): Up to 10 total digits, 2 after decimal (like 12345678.99)
Real World Examples:
-- Financial data where precision matters
CREATE TABLE financial_records (
transaction_id INT,
amount DECIMAL(15,2), -- Money values (exact)
interest_rate DECIMAL(5,4), -- Percentage with 4 decimal places
calculated_value FLOAT -- Scientific calculations (approximate)
);
When to Use Each:
- DECIMAL: Money, percentages, any value requiring exact precision
- FLOAT: Scientific calculations, measurements where small inaccuracies are acceptable
String Data Types
Character Types
CHAR(n): Fixed-length strings VARCHAR(n): Variable-length strings TEXT: Large text blocks
Understanding Fixed vs Variable Length:
CHAR Examples:
-- State codes are always 2 characters
state_code CHAR(2) -- 'NY', 'CA', 'TX'
-- Product codes with fixed format
product_code CHAR(10) -- 'PROD000001', 'PROD000002'
If you store ‘NY’ in CHAR(10), it uses full 10 characters with padding spaces.
VARCHAR Examples:
-- Names vary in length
first_name VARCHAR(50) -- 'John' uses 4 characters, 'Alexander' uses 9
-- Email addresses
email VARCHAR(255) -- Uses only as much space as needed
TEXT for Large Content:
-- Blog posts, comments, descriptions
CREATE TABLE articles (
id INT,
title VARCHAR(200),
content TEXT, -- Can store thousands of characters
summary VARCHAR(500)
);
Choosing String Types:
- CHAR: When length is always the same (codes, flags)
- VARCHAR: When length varies but has a reasonable maximum
- TEXT: When content can be very long and length is unpredictable
Date and Time Types
Understanding Date/Time Storage
Databases need to store temporal data in ways that allow for calculations, comparisons, and formatting.
DATE: Stores year, month, and day TIME: Stores hour, minute, second (and sometimes microseconds) DATETIME/TIMESTAMP: Stores both date and time together
Real World Examples:
CREATE TABLE events (
event_id INT,
event_name VARCHAR(100),
event_date DATE, -- '2024-12-25'
start_time TIME, -- '14:30:00'
created_at DATETIME, -- '2024-12-25 14:30:00'
last_updated TIMESTAMP -- Includes timezone information
);
Date Format Standards: Most databases use ISO format: YYYY-MM-DD for dates and HH:MM:SS for times.
Practical Applications:
-- Scheduling system
CREATE TABLE appointments (
appointment_id INT,
patient_name VARCHAR(100),
appointment_date DATE, -- When is the appointment
appointment_time TIME, -- What time
duration_minutes INT, -- How long
created_on TIMESTAMP -- When was this record created
);
Boolean Data Types
BOOLEAN/BOOL: Stores true/false values Some databases use TINYINT(1) or BIT to represent boolean values.
Understanding Boolean Logic:
- TRUE: Represents yes, on, active, enabled
- FALSE: Represents no, off, inactive, disabled
- NULL: Unknown or undefined state
Real World Examples:
CREATE TABLE user_preferences (
user_id INT,
email_notifications BOOLEAN, -- TRUE = wants emails
is_premium_member BOOLEAN, -- TRUE = premium account
account_active BOOLEAN, -- TRUE = account is active
newsletter_subscribed BOOLEAN -- TRUE = subscribed to newsletter
);
Practical Usage: Boolean columns are perfect for:
- Feature flags (is_enabled, is_visible)
- Status indicators (is_active, is_deleted)
- User preferences (wants_notifications, accepts_marketing)
- Conditional logic (is_paid, is_shipped, is_completed)
Choosing the Right Data Type
Performance Considerations:
- Smaller data types use less storage and memory
- Appropriate data types enable better indexing
- String operations are generally slower than numeric operations
Data Integrity Benefits:
- Correct data types prevent invalid data entry
- Enable database constraints and validations
- Support proper sorting and comparison operations
Example of Poor vs Good Data Type Choices:
Poor Choice:
CREATE TABLE products (
price VARCHAR(20), -- Storing numbers as text
in_stock VARCHAR(10), -- Storing boolean as text
created_date VARCHAR(50) -- Storing dates as text
);
Good Choice:
CREATE TABLE products (
price DECIMAL(10,2), -- Proper numeric type
in_stock BOOLEAN, -- Proper boolean type
created_date DATETIME -- Proper date type
);
Operators in SQL
Operators are symbols or words that perform operations on data values. They’re the building blocks for creating conditions, calculations, and logical expressions in SQL.
Arithmetic Operators
Arithmetic operators perform mathematical calculations on numeric data.
Basic Arithmetic Operations
Addition (+): Adds two numbers together Subtraction (-): Subtracts the second number from the first Multiplication (*): Multiplies two numbers Division (/): Divides the first number by the second Modulus (%): Returns the remainder after division
Real World Examples:
Calculating Totals:
-- Calculate order total with tax
SELECT
order_id,
subtotal,
tax_rate,
subtotal * tax_rate AS tax_amount,
subtotal + (subtotal * tax_rate) AS total_amount
FROM orders;
Price Calculations:
-- Calculate discounted prices
SELECT
product_name,
original_price,
discount_percentage,
original_price * (discount_percentage / 100) AS discount_amount,
original_price - (original_price * discount_percentage / 100) AS sale_price
FROM products;
Inventory Management:
-- Calculate remaining stock after orders
SELECT
product_id,
current_stock,
ordered_quantity,
current_stock - ordered_quantity AS remaining_stock
FROM inventory;
Advanced Arithmetic Applications
Time Calculations:
-- Calculate employee work hours
SELECT
employee_name,
hours_worked,
hourly_rate,
hours_worked * hourly_rate AS gross_pay,
(hours_worked * hourly_rate) * 0.20 AS tax_deduction,
(hours_worked * hourly_rate) - ((hours_worked * hourly_rate) * 0.20) AS net_pay
FROM payroll;
Performance Metrics:
-- Calculate conversion rates
SELECT
campaign_name,
total_clicks,
conversions,
(conversions * 100.0 / total_clicks) AS conversion_rate_percentage
FROM marketing_campaigns;
Comparison Operators
Comparison operators compare values and return true or false results. They’re essential for filtering data and creating conditions.
Equality and Inequality
Equal (=): Checks if values are the same Not Equal (!= or <>): Checks if values are different Greater Than (>): First value is larger than second Less Than (<): First value is smaller than second Greater Than or Equal (>=): First value is larger than or equal to second Less Than or Equal (<=): First value is smaller than or equal to second
Practical Applications:
Customer Segmentation:
-- Find high-value customers
SELECT customer_name, total_purchases
FROM customers
WHERE total_purchases >= 10000;
-- Find recent customers
SELECT customer_name, registration_date
FROM customers
WHERE registration_date > '2024-01-01';
Inventory Management:
-- Find low stock items
SELECT product_name, current_stock
FROM products
WHERE current_stock <= 10;
-- Find overpriced items
SELECT product_name, price
FROM products
WHERE price > 1000;
Employee Management:
-- Find eligible employees for promotion
SELECT employee_name, years_experience, performance_rating
FROM employees
WHERE years_experience >= 5 AND performance_rating >= 4.0;
Special Comparison Operators
BETWEEN: Checks if a value falls within a range IN: Checks if a value matches any value in a list LIKE: Checks if text matches a pattern IS NULL / IS NOT NULL: Checks for missing values
BETWEEN Examples:
-- Find products in a price range
SELECT product_name, price
FROM products
WHERE price BETWEEN 100 AND 500;
-- Find orders from last quarter
SELECT order_id, order_date
FROM orders
WHERE order_date BETWEEN '2024-01-01' AND '2024-03-31';
IN Examples:
-- Find customers from specific cities
SELECT customer_name, city
FROM customers
WHERE city IN ('New York', 'Los Angeles', 'Chicago');
-- Find orders with specific statuses
SELECT order_id, status
FROM orders
WHERE status IN ('Pending', 'Processing', 'Shipped');
Logical Operators
Logical operators combine multiple conditions to create complex filtering logic.
Core Logical Operators
AND: All conditions must be true
OR: At least one condition must be true
NOT: Reverses the truth value of a condition
Understanding Logical Combinations:
AND Operator Applications:
-- Find premium customers in specific regions
SELECT customer_name, membership_level, region
FROM customers
WHERE membership_level = 'Premium' AND region = 'North America';
-- Find products that are both in stock and reasonably priced
SELECT product_name, stock_quantity, price
FROM products
WHERE stock_quantity > 0 AND price <= 100;
OR Operator Applications:
-- Find urgent orders (either high priority or express shipping)
SELECT order_id, priority, shipping_method
FROM orders
WHERE priority = 'High' OR shipping_method = 'Express';
-- Find employees in sales or marketing departments
SELECT employee_name, department
FROM employees
WHERE department = 'Sales' OR department = 'Marketing';
NOT Operator Applications:
-- Find customers who are not from California
SELECT customer_name, state
FROM customers
WHERE NOT state = 'California';
-- Find products that are not discontinued
SELECT product_name, status
FROM products
WHERE NOT status = 'Discontinued';
Complex Logical Combinations
Combining Multiple Operators:
-- Find high-value active customers, excluding internal accounts
SELECT customer_name, account_type, total_purchases, is_active
FROM customers
WHERE (total_purchases > 5000 AND is_active = TRUE)
AND NOT account_type = 'Internal';
Parentheses for Precedence:
-- Find products that are either (cheap and in stock) or (expensive and on sale)
SELECT product_name, price, stock_quantity, on_sale
FROM products
WHERE (price < 50 AND stock_quantity > 0)
OR (price > 200 AND on_sale = TRUE);
Operator Precedence and Best Practices
Order of Operations in SQL:
- Parentheses (highest priority)
- Multiplication and Division
- Addition and Subtraction
- Comparison operators
- NOT
- AND
- OR (lowest priority)
Best Practices for Using Operators:
Use Parentheses for Clarity:
-- Clear and readable
WHERE (status = 'Active' AND region = 'West')
OR (status = 'Pending' AND priority = 'High')
-- Ambiguous without parentheses
WHERE status = 'Active' AND region = 'West'
OR status = 'Pending' AND priority = 'High'
Consistent Formatting:
-- Well-formatted condition
WHERE customer_type = 'Premium'
AND total_purchases >= 1000
AND last_order_date >= '2024-01-01'
AND region IN ('North', 'South', 'East');
Performance Considerations:
- Place most selective conditions first in AND chains
- Use appropriate data types for comparisons
- Consider indexing columns used in WHERE conditions frequently
Understanding these SQL basics provides the foundation for writing effective database queries and managing data efficiently. Each concept builds upon the others to create a comprehensive toolkit for database interaction.