Skip to main content

Database Theory

Imagine databases as the foundation of a house for your AI Agents. Here, we'll break down how to design "memory stores" for neural networks so that your systems can quickly find the data they need, like a librarian in a giant book universe. You'll learn to choose between SQL and NoSQL (like picking keys to different safes), optimize queries (like building expressways for data), and ensure transaction reliability - it's these skills that will turn you from a novice into an architect of industrial AI systems.

Ask AI Instructions
instruction

Since these topics don't change over time, it's best to study them with a personal tutor - ChatGPT.

The learning process should be as follows:

  • You create a system prompt for ChatGPT (templates), where you describe your background, preferences, level of detail in explanations, etc.
  • Copy the topic from the list (triple click), and ask ChatGPT to explain this topic to you
  • If you want to delve deeper, ask clarifying questions

Currently, this is the most convenient way to learn the basics. In addition to the concepts, you can study additional materials in the Gold, Silver, and Extra sections.

  1. Gold - be sure to study before communicating with ChatGPT
  2. Ask AI - ask questions on each unfamiliar topic
  3. Silver - secondary materials
  4. Extra - in-depth topics

Golden

bytebyte db playlist - https://youtube.com/playlist?list=PLCRMIe5FDPsdnSszazqVIQFh99t1ExH19&si=3tVd_dGWxmtQ2pQo

Fireship

Ask AI

Basic Database Concepts

  1. Database architecture: basic components and terms (Foundation)
  2. Relational DBs: from normalization to relationships between tables
  3. Keys in DBs: designing relationships and data integrity
  4. CRUD operations: data handling patterns (Workshop)

SQL and Data Handling

  1. SQL: syntax for everyday tasks (Cheat Sheet)
  2. SELECT queries: advanced filtering techniques
  3. JOIN operations: visualizing relationships between tables
  4. Transactions: writing atomic operations

NoSQL and Modern Data Storage Approaches

  1. SQL vs NoSQL: comparing data models

    SQL vs NoSQL: comparing data models
  2. Document-oriented DBs: MongoDB and working with JSON data

  3. ACID properties: transactions in databases

    ACID properties: transactions in databases

Optimization and Practical Aspects

  1. Indexes in databases: principles of operation and optimization

  2. ORM: advantages and disadvantages

  3. Relationships in databases: one-to-one, one-to-many, many-to-many

  4. Pagination: offset vs cursor-based approaches

    Pagination: offset vs cursor-based approaches
  5. Key-value stores: Redis and caching (Overview)

  6. Graph DBs: basics of working with Neo4j (Concept)

Silver

I replaced my entire tech stack with Postgres...

Exercises

Exercise: Optimizing slow queries in PostgreSQL

Situation: We have a user_actions table in PostgreSQL with 10 million records:

CREATE TABLE user_actions (
id SERIAL PRIMARY KEY,
user_id INT NOT NULL,
action_type VARCHAR(50) NOT NULL,
created_at TIMESTAMP DEFAULT NOW(),
metadata JSONB
);

Problem: The following query takes 2.8 seconds to execute:

SELECT * FROM user_actions 
WHERE user_id = 123
AND action_type = 'search'
ORDER BY created_at DESC
LIMIT 100;

Task:

  1. Explain why the query is slow
  2. Suggest a solution using indexes
  3. Write SQL to create the index
  4. Explain the trade-offs of your solution
Answer
  1. Reason for slow operation: Lack of a suitable index for WHERE and ORDER BY conditions
  2. Solution: Create a composite index covering filtering and sorting
  3. SQL:
CREATE INDEX idx_user_actions_composite ON user_actions 
(user_id, action_type, created_at DESC);
  1. Trade-offs:
  • Increase in DB size by ~15-20%
  • Slight slowdown of INSERT/UPDATE operations
  • It is recommended to use CONCURRENTLY for creation in production:
CREATE INDEX CONCURRENTLY idx_user_actions_composite ...;