Data, Search, and Knowledge Systems

ROADMAP

24 tracks / 695 lessons

TRACKS

[HIDDEN]

Database Engine Internals and Implementation (Legacy Umbrella)

Legacy oversized source track retained for migration only. Prefer focused review tracks for storage engines, query execution, transactions, backend database operations, and PostgreSQL operations.

Deep Dive / 47 lessons

Not published

[HIDDEN]

Data Architecture and Platforms

End-to-end data system architecture, platform contracts, and data-intensive operating models.

Deep Dive / 46 lessons

Not published

[HIDDEN]

Search Indexing and Retrieval

Index construction, retrieval models, hybrid search, and crawl-to-index system design.

Deep Dive / 34 lessons

Not published

[HIDDEN]

Ranking, Evaluation, and Search Quality

Learning to rank, experimentation, evaluation pipelines, and search quality governance.

Specialization / 26 lessons

Not published

[HIDDEN]

Web-Scale Data Mining and Recommendation

Approximation, large-scale analytics, graph mining, clustering, and recommender architectures.

Atlas / 62 lessons

Not published

[DRAFT]

Analytical Query Engines and Warehouses

Columnar storage, execution engines, warehouse architecture, vectorization, and the internals of large-scale analytical query systems.

Deep Dive / 32 lessons

Not published

[DRAFT]

Data Integration, CDC, and Pipelines

Change capture, ingestion contracts, backfills, schema drift, and the operational trade-offs of moving data through modern pipelines.

Specialization / 24 lessons

Not published

[DRAFT]

Data Systems Foundations

Data models, storage trade-offs, batch versus streaming, analytical versus transactional systems, and the basic mental models for modern data stacks.

Foundation / 16 lessons

Not published

[DRAFT]

Data Lakehouse and Storage Formats

Columnar formats, table metadata layers, schema evolution, compaction, and lakehouse architecture.

Specialization / 24 lessons

Not published

[DRAFT]

Metadata, Lineage, and Catalog Systems

Schemas, ownership, lineage graphs, discovery surfaces, and the metadata infrastructure that makes data platforms governable.

Specialization / 24 lessons

Not published

[DRAFT]

Streaming Data Infrastructure

Streaming ingestion, stateful processors, watermarks, checkpoints, exactly-once claims, backpressure, replay, and the platform patterns behind low-latency data movement.

Specialization / 24 lessons

Not published

[DRAFT]

Knowledge Graphs and Entity Resolution

Entity linking, graph modeling, canonicalization, and the data structures used to connect knowledge across noisy sources.

Specialization / 24 lessons

Not published

[DRAFT]

Large-Scale Data Mining

Approximation, sketching, graph mining, clustering, and large-scale analytical pipelines without the recommendation stack mixed in.

Deep Dive / 32 lessons

Not published

[DRAFT]

Ads, Auctions, and Marketplace Ranking

Auction design, bidding signals, marketplace objectives, and the ranking trade-offs unique to monetized retrieval systems.

Deep Dive / 32 lessons

Not published

[DRAFT]

Query Understanding and Semantic Retrieval

Intent parsing, reformulation, semantic matching, and the retrieval improvements that start from better representations of user needs.

Specialization / 24 lessons

Not published

[DRAFT]

Experimentation and Online Learning for Ranking

Interleaving, A/B testing, bandits, feedback loops, and the online methods used to improve ranking systems safely.

Specialization / 24 lessons

Not published

[DRAFT]

Recommendation and Personalization Systems

Candidate generation, ranking stacks, feedback loops, experimentation, and product-serving architectures for personalization systems.

Deep Dive / 32 lessons

Not published

[DRAFT]

Retrieval and Ranking Foundations

Indexing basics, ranking intuition, query-document matching, and the introductory mental models behind search and recommendation quality.

Foundation / 16 lessons

Not published

[DRAFT]

Vector Search and Embedding Systems

Embeddings, ANN indexes, hybrid retrieval, vector databases, and retrieval serving trade-offs.

Specialization / 32 lessons

Not published

[DRAFT]

In-Memory Data Systems and Redis

In-memory system design through Redis as the concrete case study: event loops, data structures, persistence, replication, clustering, caching, queues, locks, and operations.

Specialization / 24 lessons

Not published

[DRAFT]

NoSQL and Distributed Data Stores

Key-value, document, wide-column, graph, and search-oriented data stores with partitioning, replication, consistency, compaction, indexing, and operations.

Specialization / 24 lessons

Not published

[DRAFT]

Search Engine Serving and Operations

Production search serving: schemas, analyzers, shards, query execution, aggregations, relevance tuning, hybrid search, indexing pipelines, cluster operations, and incidents.

Specialization / 24 lessons

Not published

[DRAFT]

Backend Database Operations and Query Performance

Operational database depth for backend engineers: connection pools, isolation, query planning, index health, sharding, replicas, failover, and split-brain prevention.

Deep Dive / 24 lessons

Not published

[DRAFT]

PostgreSQL Internals and Operations

PostgreSQL-specific depth for production systems: MVCC, WAL, locks, planner evidence, indexes, vacuum, replication, pooling, migrations, security, and operational debugging.

Deep Dive / 24 lessons

Not published