Latest Articles
Thoughts on tech, data, policy, and philosophy.
Writing to an Apache Iceberg Table: How Commits and ACID Actually Work
This is Part 6 of a 15-part Apache Iceberg Masterclass. Part 5 covered hidden partitioning....
Data Virtualization and the Semantic Layer: Query Without Copying
Every data pipeline you build to move data from one system to another costs you three things: time to build it, money to run it, and freshness you los...
The Direction of AI in 2026: Performance, Cost, and the End of One Model for Everything
Six months ago, I could tell you which model to use for almost any job, and I would have said it with confidence....
The Remote Already Exists: What "Click" Got Right About Agentic AI
I rewatched “Click” recently, the 2006 Adam Sandler movie that everyone remembers as a dumb comedy and almost nobody remembers as the quiet tragedy it...
Apache Data Lakehouse Weekly: June 4 to June 11, 2026
The lakehouse community spent this week arguing about versions, and the arguments mattered....
The Fragile Trillion: What Elon Musk’s Net Worth Really Tells Us About Wealth, Spending Power, and World Hunger
On June 12, 2026, SpaceX began trading on the Nasdaq under the ticker SPCX....
Apple Goes Agentic: AI Week of June 4-11, 2026
Apple rebuilt its developer stack around AI agents at WWDC 2026 this week....
The Best Data Lakehouse Tools for Apache Iceberg in 2026: A Complete Breakdown
By Alex Merced, Head of Developer Relations at Dremio and author of books on Apache Iceberg, Apache Polaris, and data lakehouses....
Love Is Not Soft: Why Kindness, Forgiveness, and Empathy Are the Foundation of a Free Society
TL;DR Liberty is not self-sustaining....
Apache Iceberg v4: The Current State, the Proposals, and Why They Matter
Hidden Partitioning: How Iceberg Eliminates Accidental Full Table Scans
This is Part 5 of a 15-part Apache Iceberg Masterclass. Part 4 covered partition evolution....
The Role of the Semantic Layer in Data Governance
Most organizations have a data governance policy....
The State of Apache Iceberg Catalogs in June 2026
The table format question is settled....
The Complete Guide to Agentic Coding Tools in 2026
Agentic Lakehouse Concurrency and Isolation
Anatomy of an Agentic Lakehouse
Apache Iceberg v4 Roadmap: Adaptive Metadata Trees, Single-File Commits, and the Delta Convergence
Lakehouse Context Layers with Atlan and Iceberg v3
Real-Time Agentic Analytics with ClickHouse
Composable Analytics Beats Metric Catalogs
Goal-Directed Analytics Agents on Apache Iceberg
Iceberg Remote Signing for Regulated Datasets
Apache Iceberg v3 Deletion Vectors on Snowflake
CDC Without Complexity Using Iceberg v3 Row Lineage
The 2026 Guide to Iceberg View Federation
Implementing MCP in the Lakehouse
Microsoft Fabric Build 2026 Agentic Analytics Stack
Modern Python Tooling for Apache Iceberg
REST Catalog Credential Vending for Lakehouse Security
SaaS Buyers Now Inspect Your Semantic Layer
Securing Agent Identities in the Lakehouse
Bidirectional Iceberg Writes with Horizon Catalog
Semantic View Autopilot in Snowflake Semantic Studio
Zero-Copy Mirroring for Modern Lakehouse Migration
Why Dremio's Value Is Unique to Apache Iceberg Lakehouses and Agentic Analytics
Most data teams have already made two decisions, even if they haven’t written them down yet....
Apache Data Lakehouse Weekly: May 28 - June 4, 2026
The lakehouse projects spent this week doing two things at once....
AI Weekly: New PC Chips, Credit Pricing, Stateless MCP
Week of May 28 to June 4, 2026...
The Price of Protection: Tariffs, Trade Wars, and Who Really Pays
TL;DR A tariff is a tax....
Markets and Profits Are About More Than Efficiency
The Lovatarian Newsletter...
Partition Evolution: Change Your Partitioning Without Rewriting Data
This is Part 4 of a 15-part Apache Iceberg Masterclass. Part 3 covered metadata-driven performance....
Why Your AI Initiatives Fail Without a Semantic Layer
Your team builds an AI agent....
What AI Is and Isn't: A Layperson's Guide to How LLMs Actually Work
Getting Started with AI for Free: Every Tool Google Gives You at No Cost
ChatGPT and Claude: Which AI Service Should You Pay For
A Tour of Specialized AI Tools: Music, Video, Images, and More
Going Advanced: Open Source Models, Hermes Agent, and Local AI
Data Platform Native AI Agent Tooling in 2026
The Complete Guide to Agentic Coding Tools in 2026
Corporate Welfare: The Socialism Nobody Talks About
TL;DR The United States spends approximately $181 billion per year in direct federal subsidies, grants, tax expenditures, and other financial support ...
Apache Data Lakehouse Weekly: May 21-27, 2026
The week after a major release tends to look quiet on a project’s dev list....
AI Weekly: Cheaper Coding Models, Custom Chips, and a Stateless MCP
The past week pushed three quiet shifts into the open....
Active Monitoring: How Agentic AI Auto-Heals and Protects Enterprise Data Pipelines
Anatomy of an Agentic Analytics System: Inside the Multi-Step Reasoning Loop
Mastering Apache Iceberg v3: What's New and How to Plan Your Upgrade
Building a Custom Agentic Analytics System: Python, LangChain, and SQL Data Lakes
The Death of the Data Swamp: Establishing Governance in Your 2026 Data Lakehouse
How Apache Iceberg Resolves the Hybrid-Cloud Challenge in Heavily Regulated Markets
Securing Apache Iceberg Tables with Fine-Grained Row and Column Level Access Control
Designing an Immutable Data Lakehouse: Best Practices for Iceberg Snapshot Expiration
Decoupling Storage and Compute in Apache Iceberg: A Deep Dive into Cost Optimization
Legacy Warehouses to Open Lakehouses: A Step-by-Step Migration Playbook
Building the Brain of the Agentic Lakehouse: Designing an Open Catalog Architecture
Evaluating the TCO of an Open Lakehouse vs. Proprietary Data Warehouses
Real-Time BI: Enabling Sub-Second Queries on Apache Iceberg Data Lakehouses
The Rise of Agentic Analytics: Shifting BI from Passive Dashboards to Goal-Directed Action
The Semantic Layer as a Translation Engine: Bridging Natural Language and SQL
Comparing the Top 2026 Agentic Analytics Tools: ThoughtSpot, Databricks, and Tableau
Trustworthy AI in the Agentic Lakehouse: Reconciling Concurrency and Isolation Contracts
The 2026 Unified Data Architecture: Reconciling Multi-Cloud Data Lakehouses
Why Traditional Lakehouses Fail AI Agents: The Mathematical Case for the Agentic Lakehouse
The Era of Zero-ETL Federation: Fueling AI Agents with Real-Time Cross-Enterprise Data
Single-Node Data Engineering: DuckDB, DataFusion, Polars, and LakeSail
For the past decade, data engineering was synonymous with distributed clusters....
Iran and a History of Unintended Consequences from Foreign Intervention
TL;DR The current crisis with Iran is not an isolated event....
Performance and Apache Iceberg's Metadata
Use Hermes Agent for Free With DeepSeek V4 and Slack
Automating Table Maintenance Before Small Files Accumulate
Learn how Databricks Predictive Optimization, AWS S3 Tables, and Iceberg native actions automate compaction and snapshot management before small files...
Choosing the Right Iceberg Control Plane: Polaris vs. Unity Catalog vs. Cloud REST
Choosing an Apache Iceberg catalog? Compare open-source Apache Polaris, open Unity Catalog, and managed cloud REST control planes to unify your lakeho...
Clean Rooms for Privacy-Preserving Analytics
Data clean rooms enable secure multi-party analytics without sharing raw data. Learn how Databricks Clean Rooms, AWS Clean Rooms, and BigQuery differe...
Building Composable Query Engines with Rust Runtimes
Apache DataFusion, Velox, and Substrait form the foundation of modern composable query engine stacks. Learn how these components fit together and when...
Data Mesh After the Hype: What Actually Works
Three years after Zhamak Dehghani's original papers, data mesh has proven valuable in specific organizational contexts and impractical in others. Here...
How dbt Fusion Reshapes Analytics Engineering
dbt Fusion entered public beta in May 2025 with a Rust-powered runtime that changes how analytics engineers develop, validate, and deploy SQL models. ...
Using DuckDB and Polars to Query Iceberg Tables
DuckDB 1.4 LTS and Polars streaming engine now both support reading and writing Apache Iceberg tables. Learn how to use them for lakehouse analytics i...
FinOps for Data Warehouses with Open Billing Data
The FOCUS 1.3 specification and native warehouse cost views make real-time cost attribution practical. Learn how to build a FinOps pipeline for Snowfl...
Designing Governed RAG on Data Products
Enterprise RAG architecture that trusts its own data requires governance at the retrieval layer. Learn how to build governed RAG using data products, ...
What Iceberg V3 Advances Mean for CDC Pipelines
Apache Iceberg V3 brings deletion vectors and row lineage that reshape CDC pipeline design. Learn what these features mean for your streaming data arc...
Kafka 4.0 Changes Streaming Platform Operations
Kafka 4.0 removes ZooKeeper and ships KRaft and KIP-848 by default. Learn what those changes mean for platform operations, upgrades, and client config...
Lance and Iceberg for Multimodal AI Data
LanceDB and Apache Iceberg serve complementary roles in a multimodal AI lakehouse. Learn when to use Lance for embeddings and random access, and Icebe...
Bringing MLflow and Data Pipelines Closer Together
MLflow 3 extends observability from classic ML experiments to GenAI tracing and data pipeline lineage. Learn how to connect data quality monitoring wi...
Modern Feature Stores Beyond Batch Pipelines
Feature stores like Feast now support streaming feature views from Kafka and Kinesis alongside batch pipelines. Learn how to build real-time features ...
OpenLineage as the Spine of Data Observability
OpenLineage provides a standard API for collecting pipeline lineage across Airflow, Spark, Flink, and dbt. Learn how it powers blast radius analysis a...
When Paimon Beats Iceberg for Mutable Streams
Apache Paimon uses LSM-Tree storage for native CDC upserts without restart. Learn when Paimon outperforms Iceberg for high-churn mutable streaming wor...
Policy as Code for Lakehouse Governance
OPA, ABAC, row filters, and column masks make lakehouse governance programmable and scalable. Learn how Databricks, Snowflake Horizon, and BigQuery im...
Real-Time Lakehouse Patterns with Apache Flink and Iceberg
Learn how to build a real-time lakehouse with Apache Flink 2.1 and the Dynamic Iceberg Sink, covering schema evolution, exactly-once delivery, and com...
Why Semantic Layers Make Enterprise Text-to-SQL Safer
Text-to-SQL accuracy jumps from 40% to 85-95% when grounded in a semantic layer. Learn how Dremio, Snowflake Cortex Analyst, and dbt Semantic Layer im...
Single-Node Data Engineering: DuckDB, DataFusion, Polars, and LakeSail
Optimize single-node data engineering with DuckDB, DataFusion, Polars, and LakeSail. Compare architectures and learn when to transition to Dremio MPP....
Choosing Vector Stores for Retrieval Workloads
pgvector, Milvus, Weaviate, and LanceDB each make different tradeoffs on index type, hybrid search, scale, and operational complexity. Learn which fit...
An In-Depth Overview of the Apache Iceberg 1.11.0 Release
Apache Iceberg 1.11.0 delivers manifest list encryption, the new pluggable File Format API, credential lifecycle refreshes, and Spark/Flink improvemen...
Concurrency, Isolation, and MVCC: How Engines Handle Contention
Hash, Sort-Merge, Broadcast: How Distributed Joins Work
Partitioning, Sharding, and Data Distribution Strategies
Buffer Pools, Caches, and the Memory Hierarchy
Volcano, Vectorized, Compiled: How Engines Execute Your Query
Inside the Query Optimizer: How Engines Pick a Plan
B-Trees, LSM Trees, and the Indexing Tradeoff Spectrum
How Databases Organize Data on Disk: Pages, Blocks, and File Formats
Row vs. Column: How Storage Layout Shapes Everything
How Query Engines Think: The Tradeoffs Behind Every Data System
Migrating to Apache Iceberg: Strategies for Every Source System
Hands-On with Apache Iceberg Using Dremio Cloud
Approaches to Streaming Data into Apache Iceberg Tables
Using Apache Iceberg with Python and MPP Query Engines
Apache Iceberg Metadata Tables: Querying the Internals
Maintaining Apache Iceberg Tables: Compaction, Expiry, and Cleanup
How Data Lake Table Storage Degrades Over Time
When Catalogs Are Embedded in Storage
What Are Lakehouse Catalogs? The Role of Catalogs in Apache Iceberg
Writing to an Apache Iceberg Table: How Commits and ACID Actually Work
Hidden Partitioning: How Iceberg Eliminates Accidental Full Table Scans
Partition Evolution: Change Your Partitioning Without Rewriting Data
Performance and Apache Iceberg's Metadata
The Metadata Structure of Modern Table Formats
What Are Table Formats and Why Were They Needed?
Who Owns What Happens Inside Your Head?
Two political stories crossed the wires this past week....
War, Spyware, and the Pendulum Swing: What This Week Tells Us About Power
This week gave us a lot to sit with....
When the War Vote Never Happens: A Lovatarian Look at Iran
The United States has been at war with Iran for three weeks....
War and the Cost of Coercion
The Lovatarian | March 10, 2026...
Agentic Analytics on the Apache Lakehouse
Read the complete Open Source and the Lakehouse series:......
Apache Software Foundation: History, Purpose, and Process
Assembling the Apache Lakehouse: The Modular Architecture
Context Management Strategies for ChatGPT: A Complete Guide to Getting Better Results
Getting consistently useful results from ChatGPT requires more than writing good prompts. The real differentiator is how you manage context: the backg...
Context Management Strategies for Claude Code: A Complete Guide for Developers
Claude Code is a terminal-native agentic coding assistant that lives in your command line and operates directly on your codebase. Unlike chat-based in...
Context Management Strategies for Claude CoWork: A Complete Guide for Knowledge Workers
Claude CoWork represents a fundamentally different approach to AI context management. Unlike chat interfaces where you send messages and receive respo...
Context Management Strategies for Claude Desktop: A Complete Guide to MCP, Computer Use, and Local File Access
Claude Desktop takes everything available in Claude Web and adds three capabilities that fundamentally change how you manage context: MCP server conne...
Context Management Strategies for Claude Web: A Complete Guide to Projects, Artifacts, and Intelligent Context
Claude's web interface at claude.ai combines one of the largest context windows in the industry with a structured Project system that makes it genuine...
Context Management Strategies for Cursor: A Complete Guide to the AI-Native Code Editor
Cursor is an AI-native code editor built on the VS Code foundation that integrates AI deeply into every aspect of the development workflow. Its contex...
Context Management Strategies for Gemini CLI: A Complete Guide to Terminal-Native AI Development
Gemini CLI is an open-source terminal agent powered by Gemini models that operates directly in your command line. It brings Google's AI capabilities i...
Context Management Strategies for Gemini Web and NotebookLM: A Complete Guide to Google's AI Knowledge Ecosystem
Google's AI ecosystem for knowledge work consists of two deeply integrated tools: Gemini (the conversational AI at gemini.google.com) and NotebookLM (...
Context Management Strategies for Google Antigravity: A Complete Guide to the Agent-First IDE
Google Antigravity is an agent-first IDE built by Google DeepMind's Advanced Agentic Coding team. It approaches context management differently from ot...
Context Management Strategies for OpenAI Codex: A Complete Guide Across Browser, CLI, and App
OpenAI Codex is not a chatbot. It is an autonomous software engineering agent that runs tasks in isolated cloud sandboxes, operates across a browser i...
Context Management Strategies for OpenCode: A Complete Guide to the Open-Source Terminal AI Agent
OpenCode is an open-source terminal-based AI coding agent that prioritizes privacy, local-first operation, and broad model provider support. Built as ...
Context Management Strategies for OpenWork: A Complete Guide to the Desktop AI Agent Framework
OpenWork is a desktop-native AI agent framework designed for local, multi-step task execution on your computer. Unlike browser-based AI tools or termi...
Context Management Strategies for Perplexity AI: A Complete Guide to Research-First AI Conversations
Perplexity AI occupies a unique position in the AI landscape: it is a research-first tool that combines conversational AI with real-time web search to...
Context Management Strategies for T3 Chat: A Complete Guide to the Unified Multi-Model AI Interface
T3 Chat is a modern web-based AI chat interface that gives you access to multiple AI models through a single unified platform. Its primary value propo...
Context Management Strategies for VS Code with LLM Plugins: A Complete Guide to Building Your Own AI-Powered IDE
Visual Studio Code is the most widely used code editor in the world, and its extensibility means you can integrate AI capabilities through a growing e...
Context Management Strategies for Windsurf: A Complete Guide to the AI Flow IDE
Windsurf is an AI-powered IDE built on the VS Code foundation that introduces the concept of "Flows," a paradigm where the AI maintains deep awareness...
Context Management Strategies for Zed: A Complete Guide to the High-Performance AI Code Editor
Zed is a high-performance code editor built in Rust that prioritizes speed, simplicity, and real-time collaboration. Its AI integration is designed to...
The Model Context Protocol (MCP) Explained: A Complete Guide to How Every Major AI Tool Connects to External Data
The Model Context Protocol (MCP) has become the universal standard for connecting AI models to external tools, data sources, and services. Originally ...
What is Apache Arrow? Erasing the Serialization Tax
What is Apache Iceberg? The Table Format Revolution
What is Apache Parquet? Columns, Encoding, and Performance
What is Apache Polaris? Unifying the Iceberg Ecosystem
How to Use Dremio with Amazon Kiro: Connect, Query, and Build Data Apps
Amazon Kiro is an agentic AI IDE from AWS that introduces spec-driven development to the coding workflow. Instead of jumping straight to code, Kiro he...
How to Use Dremio with Claude Code: Connect, Query, and Build Data Apps
Claude Code is Anthropic's terminal-based coding agent. It reads your files, writes code, runs commands, and maintains context across a session. Dremi...
How to Use Dremio with Claude CoWork: Connect, Query, and Build Data Apps
Claude CoWork is Anthropic's desktop agentic assistant. Unlike Claude Code (a terminal coding agent), CoWork operates as a general-purpose autonomous ...
How to Use Dremio with Cursor: Connect, Query, and Build Data Apps
Cursor is an AI-native code editor built as a fork of VS Code. It integrates AI directly into the editing experience with features like Chat, Composer...
How to Use Dremio with Gemini CLI: Connect, Query, and Build Data Apps
Gemini CLI is Google's open-source terminal-based AI agent. It runs directly in your terminal, powered by Gemini models with a 1-million token context...
How to Use Dremio with GitHub Copilot: Connect, Query, and Build Data Apps
GitHub Copilot is the most widely adopted AI coding assistant, integrated into VS Code, JetBrains IDEs, and the GitHub platform. Its agent mode allows...
How to Use Dremio with Google Antigravity: Connect, Query, and Build Data Apps
Google Antigravity is an agent-first IDE built by Google DeepMind. Its autonomous agents plan multi-step tasks, write code, browse documentation, and ...
How to Use Dremio with JetBrains AI Assistant: Connect, Query, and Build Data Apps
JetBrains AI Assistant is built into IntelliJ IDEA, PyCharm, DataGrip, and every JetBrains IDE. It provides AI chat, inline code generation, multi-fil...
How to Use Dremio with OpenAI Codex CLI: Connect, Query, and Build Data Apps
OpenAI Codex CLI is a terminal-based coding agent built in Rust. It reads your codebase, writes files, executes commands, and supports MCP for connect...
How to Use Dremio with OpenCode: Connect, Query, and Build Data Apps
OpenCode is an open-source, terminal-based AI coding agent released under the MIT license. It provides a TUI with split panes, uses the Language Serve...
How to Use Dremio with OpenWork: Connect, Query, and Build Data Apps
OpenWork is an open-source desktop AI agent built on the OpenCode engine. It runs entirely on your machine with your own API keys, giving you full con...
How to Use Dremio with Windsurf: Connect, Query, and Build Data Apps
Windsurf is an AI-native code editor built as a fork of VS Code. Its standout feature is Cascade, an agentic AI system that plans and executes multi-s...
How to Use Dremio with Zed: Connect, Query, and Build Data Apps
Zed is an open-source, GPU-accelerated code editor written in Rust. It is designed for speed and collaboration, with a built-in AI assistant that supp...
Classify Your Data with SQL: A Hands-On Guide to Dremio's AI_CLASSIFY Function
Most classification workflows require exporting data to Python, running a model, and importing results back into your warehouse. Dremio's AI_CLASSIFY ...
Connect Amazon Redshift to Dremio Cloud: Extend Your Warehouse with Federation and AI Analytics
Amazon Redshift is AWS's managed data warehouse, designed for petabyte-scale analytics. If your organization chose Redshift for analytical workloads, ...
Connect Amazon S3 to Dremio Cloud: Query Your Data Lake with SQL, Federation, and AI
Amazon S3 is the default landing zone for data in the cloud. Log files, Parquet datasets, CSV exports, JSON events, IoT telemetry, and raw data dumps ...
Connect Any Iceberg REST Catalog to Dremio Cloud: Universal Lakehouse Access
The Apache Iceberg REST Catalog specification defines a standard HTTP API for managing Iceberg table metadata. Any catalog implementation that conform...
Connect Apache Druid to Dremio Cloud: Add SQL Joins, AI, and Governance to Your Real-Time Analytics
Apache Druid is a real-time analytics database designed for sub-second queries on high-ingestion-rate event data. Clickstream analytics, application m...
Connect AWS Glue Data Catalog to Dremio Cloud: Query and Manage Your AWS Iceberg Tables
AWS Glue Data Catalog is AWS's managed metadata service for data lakes. It stores table definitions, schemas, partition information, and statistics fo...
Connect Azure Storage to Dremio Cloud: Query Your Microsoft Data Lake with SQL and AI
Azure Storage is Microsoft's cloud storage platform, spanning Blob Storage, Azure Data Lake Storage Gen2 (ADLS Gen2), and Azure Files. If your organiz...
Connect Azure Synapse Analytics to Dremio Cloud: Multi-Cloud Data Warehouse Federation
Microsoft Azure Synapse Analytics combines big data analytics and enterprise data warehousing into a single Azure-integrated platform. If your organiz...
Connect Databricks Unity Catalog to Dremio Cloud: Query Delta Lake Tables with Federation and AI
Databricks Unity Catalog is Databricks' governance layer for data and AI assets. It manages Delta Lake tables, machine learning models, feature stores...
Connect Dremio Software to Dremio Cloud: Hybrid Federation Across Deployments
Dremio Cloud can connect to Dremio Software (self-managed) instances as a federated data source. This creates a hybrid deployment where Dremio Cloud s...
Connect Google BigQuery to Dremio Cloud: Cross-Cloud Analytics Without Data Movement
Google BigQuery is Google Cloud's serverless data warehouse. If your organization uses Google Cloud Platform, BigQuery is where your analytics data, m...
Connect IBM Db2 to Dremio Cloud: Modernize Mainframe Analytics with Federation and AI
IBM Db2 is the relational database that powers critical applications across banking, insurance, government, healthcare, and manufacturing. For organiz...
Connect Microsoft SQL Server to Dremio Cloud: Federate Enterprise Data Without ETL
Microsoft SQL Server is one of the most widely deployed enterprise databases in the world. ERP systems, CRM platforms, financial applications, and cus...
Connect MongoDB to Dremio Cloud: SQL Analytics on Document Data
MongoDB is the most popular NoSQL document database. It stores data in flexible JSON-like documents, making it ideal for applications with evolving sc...
Connect MySQL to Dremio Cloud: Federated Analytics Without ETL
MySQL runs more web applications, SaaS platforms, and e-commerce backends than any other database. It's fast for transactional reads and writes, but i...
Connect Oracle Database to Dremio Cloud: Enterprise Analytics Without Data Movement
Oracle Database runs the most critical enterprise applications in the world: ERP systems, financial ledgers, supply chain management, and HR platfo...
Connect PostgreSQL to Dremio Cloud: Query, Federate, and Accelerate Your Data
PostgreSQL powers more production applications than almost any other open-source database. It's where your customer records, transaction logs, product...
Connect SAP HANA to Dremio Cloud: Unlock Analytics Beyond the SAP Ecosystem
SAP HANA is the in-memory database platform that powers SAP S/4HANA, SAP BW/4HANA, and custom enterprise applications across finance, manufacturing, l...
Connect Snowflake Open Catalog to Dremio Cloud: Multi-Engine Iceberg Analytics
Snowflake Open Catalog is Snowflake's managed implementation of the Apache Iceberg REST catalog specification, based on the open-source Apache Polaris...
Connect Snowflake to Dremio Cloud: Federate, Govern, and Accelerate Beyond Snowflake
Snowflake is a popular cloud data warehouse known for its separation of storage and compute, near-zero maintenance, and broad ecosystem. Many organiza...
Connect Vertica to Dremio Cloud: Federation for Analytics-Optimized Data
Vertica is a columnar analytics database engineered for fast aggregate queries on large datasets. It was built from the ground up for analytical workl...
Dremio's Built-in Open Catalog: Your Zero-Configuration Apache Iceberg Lakehouse
Every Dremio Cloud account starts with a built-in Open Catalog: a fully managed Apache Iceberg catalog with integrated storage. When you create a Dre...
Extract Structured Data from Text with Dremio's AI_GENERATE Function
Unstructured text is the most underused data in most organizations. Customer emails sit in inboxes. Contract notes live in text fields. Meeting summar...
Generate Summaries and Insights with Dremio's AI_COMPLETE Function
Every data team has a version of this problem: a table full of raw data that needs human-readable summaries, translations, or narrative descriptions. ...
War on Two Fronts: How Tariffs and Iran Show the Cost of Political Power
This week, the United States government gave the country two vivid demonstrations of what happens when political power runs without limits....
The Government That Would Not Stop Itself
America is living through a story it has told itself before....
Batch vs. Streaming: Choose the Right Processing Model
"We need real-time data." This is one of the most expensive sentences in data engineering, because it's rarely true, and implementing it when it's no...
Conceptual, Logical, and Physical Data Models Explained
Most data teams jump straight from a stakeholder request to creating database tables. They skip the planning steps that prevent misalignment, redundan...
Data Engineering Best Practices: The Complete Checklist
Best practices documents are easy to write and hard to use. They list principles without context, advice without prioritization, and rules without exp...
Data Modeling Best Practices: 7 Mistakes to Avoid
A bad data model doesn't announce itself. It hides behind slow dashboards, conflicting numbers, confused analysts, and AI agents that generate wrong S...
Data Modeling for Analytics: Optimize for Queries, Not Transactions
The data model that runs your production application is almost never the right model for analytics. Transactional systems are designed for fast writes...
Data Modeling for the Lakehouse: What Changes
Traditional data modeling assumed you controlled the database. You defined schemas up front, enforced foreign keys at write time, and optimized with i...
Data Quality Is a Pipeline Problem, Not a Dashboard Problem
When an analyst finds null values in a revenue column, the typical response is to add a calculated field in the BI tool: IF revenue IS NULL THEN 0. Th...
Data Vault Modeling: Hubs, Links, and Satellites
Dimensional modeling works well when your source systems are stable and your business questions are predictable. But what happens when sources change ...
Data Virtualization and the Semantic Layer: Query Without Copying
Every data pipeline you build to move data from one system to another costs you three things: time to build it, money to run it, and freshness you los...
Denormalization: When and Why to Flatten Your Data
Normalization is the first rule taught in database design. Eliminate redundancy. Store each fact once. Use foreign keys. It's the right rule for trans...
Dimensional Modeling: Facts, Dimensions, and Grains
Dimensional modeling is the most widely used approach for organizing analytics data. Developed by Ralph Kimball, it structures data into two types of ...
Headless BI: How a Universal Semantic Layer Replaces Tool-Specific Models
Your organization uses Tableau for executive dashboards, Power BI for operational reports, and Python notebooks for data science. Revenue is defined i...
How a Self-Documenting Semantic Layer Reduces Data Team Toil
Every data team knows documentation is important. And almost every data team has a backlog of undocumented tables, unlabeled columns, and outdated des...
How to Build a Semantic Layer: A Step-by-Step Guide
Most teams start building a semantic layer the wrong way: they open their BI tool, create a few calculated fields, and call it done. Six months later,...
How to Design Reliable Data Pipelines
Most pipeline failures aren't caused by bad code. They're caused by no architecture. A script that reads from an API, transforms JSON, and writes to a...
How to Think Like a Data Engineer
The median lifespan of a popular data tool is about three years. The tool you master today may be deprecated or replaced by the time your next project...
Idempotent Pipelines: Build Once, Run Safely Forever
A pipeline runs, processes 100,000 records, and loads them into the target table. Then it fails on a downstream step. The orchestrator retries the ent...
Partition and Organize Data for Performance
A table with 500 million rows takes 45 seconds to query. After partitioning it by date, the same query, filtering on a single day, returns in 2 seco...
Pipeline Observability: Know When Things Break
An analyst messages you on Slack: "The revenue numbers look wrong. Is the pipeline broken?" You check the orchestrator: all green. You check the targ...
Schema Evolution Without Breaking Consumers
A source team renames a column from user_id to customer_id. Twelve hours later, five dashboards show blank values, two ML pipelines fail, and the data...
Semantic Layer Best Practices: 7 Mistakes to Avoid
Semantic layers don't fail because the technology is wrong. They fail because of design decisions made in the first two weeks: choices that seem reas...
Semantic Layer vs. Data Catalog: Complementary, Not Competing
"We already have a data catalog, so we don't need a semantic layer." This is one of the most common misconceptions in modern data architecture. Catalo...
Semantic Layer vs. Metrics Layer: What's the Difference?
Both terms appear in every modern data architecture diagram. They're used interchangeably in conference talks, Slack threads, and vendor marketing. An...
Slowly Changing Dimensions: Types 1-3 with Examples
Dimensions change. A customer moves cities. A product gets reclassified. An employee changes departments. How your data model handles these changes de...
Star Schema vs. Snowflake Schema: When to Use Each
Both star schemas and snowflake schemas are dimensional models. They both organize data into fact tables (measurable events) and dimension tables (con...
Testing Data Pipelines: What to Validate and When
Ask an application developer how they test their code and they'll describe unit tests, integration tests, CI/CD pipelines, and coverage metrics. Ask a...
The Role of the Semantic Layer in Data Governance
Most organizations have a data governance policy. It lives in a Confluence page. It defines who owns what data, what terms mean, and who should have a...
What Is a Semantic Layer? A Complete Guide
Ask three teams in your company how they calculate "revenue" and you'll get three answers. Sales counts bookings. Finance counts recognized revenue. M...
What Is Data Modeling? A Complete Guide
Every database, data warehouse, and data lakehouse starts with the same question: how should this data be organized? Data modeling answers that questi...
Why Your AI Initiatives Fail Without a Semantic Layer
Your team builds an AI agent. It connects to your data warehouse. A product manager types "What was revenue last quarter?" and gets a number. The numb...
A 2026 Introduction to Apache Iceberg
An updated introduction to Apache Iceberg...
When Scale Breaks Consent: Lessons From America’s Immigration Crackdown
The Trump administration recently achieved a grim milestone....
Tariffs and Limits: What We Lose in Scaled Solutions
President Trump signed a $1.2 trillion spending package on February 3, 2026, ending a four-day partial government shutdown....
When Force Replaces Feedback: Lessons from Minneapolis
Minneapolis is burning again....
When Critics Become Criminals
Minnesota has become a pressure cooker....
A Practical Guide to AI-Assisted Coding Tools
An in-depth guide to understanding, choosing, and using AI-assisted coding tools effectively....
What Are Recursive Language Models?
Recursive Language Models (RLMs) are a new class of language models that can call themselves to break down complex tasks into manageable parts. This a...
2025 Year in Review Apache Iceberg, Polaris, Parquet, and Arrow
A look back at key developments in Apache Iceberg, Polaris, Parquet, and Arrow in 2025....
dremioframe & iceberg - Pythonic interfaces for Dremio and Apache Iceberg
Discover DremioFrame and IceFrame, two new Python libraries that simplify working with Dremio and Apache Iceberg. Learn how these tools streamline dat...
Introducing dremioframe - A Pythonic DataFrame Interface for Dremio
Discover dremioframe, a new Python library that offers a DataFrame-like experience for interacting with Dremio's data lakehouse platform. Learn how to...
Comprehensive Hands-on Walk Through of Dremio Cloud Next Gen (Hands-on with Free Trial)
Walkthrough with the new trial of the Dremio Cloud Platform...
2025-2026 Guide to Learning about Apache Iceberg, Data Lakehouse & Agentic AI
A curated guide to mastering Apache Iceberg, data lakehouse architectures, and the emerging field of Agentic AI for data professionals....
An Exploration of the Commercial Iceberg Catalog Ecosystem
Dive into the world of commercial Iceberg catalogs and discover how they enhance data lakehouse architectures for modern data engineering....
Building a Universal Lakehouse Catalog - Beyond Iceberg Tables
Exploring paths to a universal lakehouse catalog that supports multiple data formats and engines, building on Apache Iceberg's success....
Intro to Apache Iceberg with Apache Polaris and Apache Spark
Learn how to leverage Apache Iceberg with Apache Polaris and Apache Spark to build scalable and efficient data lakehouses....
The State of Apache Iceberg v4 - October 2025 Edition
What's Coming in Apache Iceberg v4: A Deep Dive into the Future of Open Table Formats...
The Ultimate Guide to Open Table Formats - Iceberg, Delta Lake, Hudi, Paimon, and DuckLake
Understanding Iceberg, Delta Lake, Hudi, Paimon, and DuckLake...
The 2025 & 2026 Ultimate Guide to the Data Lakehouse and the Data Lakehouse Ecosystem
What is the Data Lakehouse and the Data Lakehouse Ecosystem? This comprehensive guide covers everything you need to know about the Data Lakehouse arch...
The Endgame – Building an Autonomous Optimization Pipeline for Apache Iceberg
Learn how to automate compaction, snapshot expiration, and layout optimization in Apache Iceberg using metadata-driven triggers and orchestration tool...
Managing Large-Scale Optimizations – Parallelism, Checkpointing, and Fail Recovery
Learn how to scale Apache Iceberg table optimizations across large datasets using parallelism, checkpointing, and fail recovery to ensure reliability ...
Unlocking the Power of Agentic AI with Apache Iceberg and Dremio
Unlocking the Power of Agentic AI with Apache Iceberg and Dremio...
Hidden Pitfalls – Compaction and Partition Evolution in Apache Iceberg
Partition evolution in Apache Iceberg is a powerful feature, but if not managed carefully, it can introduce fragmentation and impact compaction perfor...
Using Iceberg Metadata Tables to Determine When Compaction Is Needed
Discover how to use Apache Iceberg's metadata tables to proactively detect small files, bloated manifests, and table fragmentation - so you can trigge...
Designing the Ideal Cadence for Compaction and Snapshot Expiration
Learn how to design an effective schedule for compaction and snapshot expiration in Apache Iceberg to balance cost, performance, and data freshness....
Avoiding Metadata Bloat with Snapshot Expiration and Rewriting Manifests
Learn how to prevent and clean up metadata bloat in Apache Iceberg by expiring snapshots and rewriting manifests for better performance and manageabil...
Smarter Data Layout – Sorting and Clustering Iceberg Tables
Improve query performance in Apache Iceberg by organizing your data layout with sorting and Z-order clustering. Learn how to reduce scan cost and impr...
Optimizing Compaction for Streaming Workloads in Apache Iceberg
Learn how to design fast, incremental compaction strategies in Apache Iceberg to support high-throughput streaming pipelines without disrupting freshn...
The Basics of Compaction – Bin Packing Your Data for Efficiency
Learn how standard compaction works in Apache Iceberg and why bin packing your data files is essential for maintaining query performance and cost effi...
The Cost of Neglect – How Apache Iceberg Tables Degrade Without Optimization
Learn how Apache Iceberg tables can degrade over time without optimization and what issues this causes for performance, cost, and governance....
How to Discover or Organize Lakehouse & Apache Iceberg Meetups
Guide on How to Be Part of the Lakehouse Community...
Introduction to Data Engineering Concepts | What is Data Engineering?
Introduction to the terms in data engineering...
Introduction to Data Engineering Concepts | Understanding Data Sources and Ingestion
Introduction to the terms in data engineering...
Introduction to Data Engineering Concepts | ETL vs ELT – Understanding Data Pipelines
Introduction to the terms in data engineering...
Introduction to Data Engineering Concepts | Batch Processing Fundamentals
Introduction to the terms in data engineering...
Introduction to Data Engineering Concepts | Streaming Data Fundamentals
Introduction to the terms in data engineering...
Introduction to Data Engineering Concepts | Data Modeling Basics
Introduction to the terms in data engineering...
Introduction to Data Engineering Concepts | Data Warehousing Fundamentals
Introduction to the terms in data engineering...
Introduction to Data Engineering Concepts | Data Lakes Explained
Introduction to the terms in data engineering...
Introduction to Data Engineering Concepts | Storage Formats and Compression
Introduction to the terms in data engineering...
Introduction to Data Engineering Concepts | Data Quality and Validation
Introduction to the terms in data engineering...
Introduction to Data Engineering Concepts | Metadata, Lineage, and Governance
Introduction to the terms in data engineering...
Introduction to Data Engineering Concepts | Scheduling and Workflow Orchestration
Introduction to the terms in data engineering...
Introduction to Data Engineering Concepts | Building Scalable Pipelines
Introduction to the terms in data engineering...
Introduction to Data Engineering Concepts | DevOps for Data Engineering
Introduction to the terms in data engineering...
Introduction to Data Engineering Concepts | Cloud Data Platforms and the Modern Stack
Introduction to the terms in data engineering...
Introduction to Data Engineering Concepts | Data Lakehouse Architecture Explained
Introduction to the terms in data engineering...
Introduction to Data Engineering Concepts | Apache Iceberg, Arrow, and Polaris
Introduction to the terms in data engineering...
Introduction to Data Engineering Concepts | The Power of Dremio in the Modern Lakehouse
Introduction to the terms in data engineering...
Tariffs, Interest Rates, and Economic Tension: A Balancing Act
President Donald Trump has dramatically increased tariffs—particularly those targeting Chinese imports—with the stated goal of revitalizing U.S....
A Journey from AI to LLMs and MCP - 10 - Sampling and Prompts in MCP – Making Agent Workflows Smarter and Safer
Sampling and Prompts in MCP – Making Agent Workflows Smarter and Safer...
A Journey from AI to LLMs and MCP - 9 - Tools in MCP – Giving LLMs the Power to Act
Tools in MCP – Giving LLMs the Power to Act...
A Journey from AI to LLMs and MCP - 8 - Resources in MCP – Serving Relevant Data Securely to LLMs
Resources in MCP – Serving Relevant Data Securely to LLMs...
A Journey from AI to LLMs and MCP - 7 - Under the Hood – The Architecture of MCP and Its Core Components
Under the Hood – The Architecture of MCP and Its Core Components...
Journey from AI to LLMs and MCP - 6 - Enter the Model Context Protocol (MCP) – The Interoperability Layer for AI Agents
Enter the Model Context Protocol (MCP) – The Interoperability Layer for AI Agents...
A Journey from AI to LLMs and MCP - 5 - AI Agent Frameworks – Benefits and Limitations
AI Agent Frameworks – Benefits and Limitations...
A Journey from AI to LLMs and MCP - 4 - What Are AI Agents – And Why They're the Future of LLM Applications
What Are AI Agents – And Why They're the Future of LLM Applications...
A Journey from AI to LLMs and MCP - 3 - Boosting LLM Performance – Fine-Tuning, Prompt Engineering, and RAG
Boosting LLM Performance – Fine-Tuning, Prompt Engineering, and RAG...
A Journey from AI to LLMs and MCP - 2 - How LLMs Work – Embeddings, Vectors, and Context Windows
How LLMs Work – Embeddings, Vectors, and Context Windows...
A Journey from AI to LLMs and MCP - 1 - What Is AI and How It Evolved Into LLMs
What Is AI and How It Evolved Into LLMs...
Building a Basic MCP Server with Python
The Basics of Building a Basic MCP Server...
Using Helm with Kubernetes - A Guide to Helm Charts and Their Implementation
A Guide on when to use Helm Charts for Kubernetes Deployment...
Crash Course on Developing AI Applications with LangChain
A guide on building AI applications with LangChain, a framework for developing AI applications powered by Large Language Models (LLMs)....
The Data Lakehouse - The Benefits and Enhancing Implementation
Understanding the value of a lakehouse and how to get that value faster...
2025 Comprehensive Guide to Apache Iceberg
What is Apache Iceberg, How it Works, and Why it Matters!...
Net Worth, Liquid Cash, and the Misconceptions About Wealth
It’s a common refrain in modern political discourse: “So-and-so has a net worth of X....
Boring but Essential: Why Institutions and Rights Should Fade into the Background of Daily Life
Society is built on institutions and rights—foundations that underpin the social, economic, and cultural fabric of human interaction....
When to use Apache Xtable or Delta Lake Uniform for Data Lakehouse Interoperability
A Guide on when to use Apache Xtable or Delta Lake Uniform for Data Lakehouse Interoperability...
RAG Isn’t a Modeling Problem. It’s a Data Engineering Problem.
Why retrieval-augmented generation systems fail in enterprises - and what to do about it....
Building Pangolin - My Holiday Break, an AI IDE, and a Lakehouse Catalog for the Curious
A personal story of how I built Pangolin Catalog over a holiday break using an AI-powered IDE....
2025 Guide to Architecting an Iceberg Lakehouse
A Comprehensive Guide to Building a Data Lakehouse with Apache Iceberg...
10 Future Apache Iceberg Developments to Look forward to in 2025
What is cool about Apache Iceberg's Future...
Deep Dive into Dremio's File-based Auto Ingestion into Apache Iceberg Tables
Auto ingesting data from JSON, CSV, and Parquet files into Apache Iceberg Tables...
The Power of Direct Action: Empowerment Beyond the Ballot Box
Election season is a time of heightened emotions, charged debates, and intense focus on who will hold office....
Intro to SQL using Apache Iceberg and Dremio
Intro to SQL using Apache Iceberg and Dremio...
Dremio, Apache Iceberg and their role in AI-Ready Data
The Role of Dremio and Apache Iceberg in AI-Ready Data...
Introduction to Cargo and cargo.toml
Getting Started with Cargo and cargo.toml...
Leveraging Python's Pattern Matching and Comprehensions for Data Analytics
Using Features like Pattern Matching and Comprehensions for Data Analytics...
Hands-on with Apache Iceberg & Dremio on Your Laptop within 10 Minutes
How to get hands-on with Apache Iceberg...
Data Modeling - Entities and Events
How to Model Events and Entities...
All About Parquet Part 01 - An Introduction
All about the Apache Parquet File Format...
All About Parquet Part 02 - Parquet's Columnar Storage Model
All about the Apache Parquet File Format...
All About Parquet Part 03 - Parquet File Structure | Pages, Row Groups, and Columns
All about the Apache Parquet File Format...
All About Parquet Part 04 - Schema Evolution in Parquet
All about the Apache Parquet File Format...
All About Parquet Part 05 - Compression Techniques in Parquet
All about the Apache Parquet File Format...
All About Parquet Part 06 - Encoding in Parquet | Optimizing for Storage
All about the Apache Parquet File Format...
All About Parquet Part 07 - Metadata in Parquet | Improving Data Efficiency
All about the Apache Parquet File Format...
All About Parquet Part 08 - Reading and Writing Parquet Files in Python
All about the Apache Parquet File Format...
All About Parquet Part 09 - Parquet in Data Lake Architectures
All about the Apache Parquet File Format...
All About Parquet Part 10 - Performance Tuning and Best Practices with Parquet
All about the Apache Parquet File Format...
Orchestrating Airflow DAGs with GitHub Actions - A Lightweight Approach to Data Curation Across Spark, Dremio, and Snowflake
Advanced GitHub Actions for Data Engineering...
A Deep Dive Into GitHub Actions From Software Development to Data Engineering
Learning about GitHub Actions...
A Guide to dbt Macros - Purpose, Benefits, and Usage
Learning about dbt Macros...
Data Lakehouse Roundup 1 - News and Insights on the Lakehouse
What's Going on in the Data Lakehouse Space...
Getting Started with Data Analytics Using PyArrow in Python
Learning to work with PyArrow to run analytics...
Working with Collections in Rust | A Comprehensive Guide
Rust Arrays, Vectors and more!...
What is Three-Tier Data (Bronze, Silver, Gold) and How Dremio Simplifies It
Process Data from Raw to Clean Aggregated Data...
A Brief Guide to the Governance of Apache Iceberg Tables
Controlling Access to your Apache Iceberg Tables...
Exploring Data Operations with PySpark, Pandas, DuckDB, Polars, and DataFusion in a Python Notebook
Learning to work with Python to ingest and query data...
Ultimate Directory of Apache Iceberg Resources
Apache Iceberg Education, Tutorials and more!...
Change Data Capture (CDC) when there is no CDC
Handling the Syncing of Changing Data Across Systems...
Virtualization + Lakehouse + Mesh = Data At Scale
Combining Centralization and Decentralization for Data at Scale...
Deep Dive into Data Apps with Streamlit
Building and Deploying Data Apps Easily...
A Deep Dive into Docker Compose
A Comprehensive Guide to Docker Compose...
In-Depth Guide to Working with Strings in Rust
Strings in Rust...
Getting Started with Rust - A Modern Systems Programming Language
Get Started with Rust...
Hands-on with Apache Iceberg on Your Laptop - Deep Dive with Apache Spark, Nessie, Minio, Dremio, Polars and Seaborn
The Evolving Data Lakehouse World...
Why Data Analysts, Engineers, Architects and Scientists Should Care about Dremio and Apache Iceberg
The Evolving Data Lakehouse World...
5 Trends in the Data Lakehouse Space
The Evolving Data Lakehouse World...
Using the alexmerced/datanotebook Docker Image
Setting up a quick and easy data environment for data science and analytics...
Understanding Apache Iceberg Delete Files
Continuing the Understand Apache Iceberg series, this article delves into the Manifest, a critical component of Apache Iceberg's architecture....
Understanding the Apache Iceberg Manifest
Continuing the Understand Apache Iceberg series, this article delves into the Manifest, a critical component of Apache Iceberg's architecture....
Understanding the Apache Iceberg Manifest List (Snapshot)
Continuing the Understand Apache Iceberg series, this article delves into the Manifest List, a critical component of Apache Iceberg's architecture....
Understanding Apache Iceberg's Metadata.json
The role and content of the metadata.json...
What Apache Iceberg REST Catalog is and isn't
Understanding Iceberg Catalog Interoperability...
ACID Guarantees and Apache Iceberg - Turning Any Storage into a Data Warehouse
What are ACID Guarantees? Why do they matter?...
Data Lakehouse 101 - The Who, What and Why of Data Lakehouses
The Who, What and Why of Data Lakehouses...
Understanding the Polaris Iceberg Catalog and Its Architecture
Learn about the new open source Iceberg Catalog in Town...
Apache Iceberg Reliability
Why Apache Iceberg Works...
Upcoming Data Talks from Alex Merced (And how to follow)
Come see me talk live at these events...
Databases Deconstructed - The Value of Data Lakehouses and Table Formats
Building up the Data Lakehouse...
Video Course - Basics of Lakehouse Engineering - Apache Iceberg, Nessie, Dremio
Introductory Course to Data Engineering for Apache Iceberg Lakehouses...
Introduction to Sorting Algorithms in JavaScript
Working with Sorting Algorithms in Javascript...
Partitioning with Apache Iceberg - A Deep Dive
Benefits of Apache Iceberg Partition Evolution and Hidden Partitioning...
3 Reasons Data Engineers Should Embrace Apache Iceberg
Benefits of Apache Iceberg...
Running SQL on your Excel Files From Your Laptop with Dremio
How to run SQL on your Excel files easily...
Deep Dive into Functional Programming in Javascript
Currying, Monad and Memos, Oh my!...
A Deep Intro to Apache Iceberg and Resources for Learning More
Learning about Apache Iceberg...
Understanding the Future of Apache Iceberg Catalogs
Java, Rest and the expanding open lakehouse ecosystem...
End-to-End Basic Data Engineering Tutorial (Spark, Dremio, Superset)
Ingesting Data and Building BI Dashboards...
5 Open Source Data Projects You Should Be Following
Apache Iceberg, Apache Arrow, Nessie, Ibis, Substrait...
5 Reasons Dremio is the Ideal Apache Iceberg Lakehouse Platform
Understanding how catalogs work and which one to choose...
The Apache Iceberg Lakehouse - The Great Data Equalizer
Disrupting the Snowflake/Databricks status quo...
10 Reasons to Make Apache Iceberg and Dremio Part of Your Data Lakehouse Strategy
Understanding how catalogs work and which one to choose...
A deep dive into the concept and world of Apache Iceberg Catalogs
Understanding how catalogs work and which one to choose...
The Role of Ontologies in Data Management
What are ontologies and why they matter...
Introduction to ANSI SQL - Understanding the Syntax and Concepts
Learning the Standard SQL Syntax...
An Introduction to Python
An overview of Python for beginners...
What is the Data Lakehouse and the Role of Apache Iceberg, Nessie and Dremio?
Understanding the Value of the Data Lakehouse...
Mastering Git | A Comprehensive Guide to git pull and git push
Having a better understanding of git pull and git push...
Partitioning Practices in Apache Hive and Apache Iceberg
Deep Dive in Data Lake Table Partitioning...
Columnar vs. Row-based Data Structures in OLTP and OLAP Systems
The Fundamentals of Data Systems...
Understanding JavaScript Promises In-Depth
Understanding Javascript Promises and Asynchronous Code...
Introduction to Data Vault Modeling
Understanding the Data Vault Style of Data Warehouse Modeling...
Table Format FUD - Thinking Through the Table Format Conversion (Apache Iceberg, Apache Hudi, Delta Lake)
Understanding how to choose a table format...
Embracing the Future of Data Management - Why Choose Lakehouse, Iceberg, and Dremio?
The Future of Data Platforms...
Open Lakehouse Engineering/Apache Iceberg Lakehouse Engineering - A Directory of Resources
Resources for learning how to Engineer an Open Data Lakehouse...
Nessie - An Alternative to Hive & JDBC for Self-Managed Apache Iceberg Catalogs
Nessie is the only open-source catalog implementation specifically for Apache Iceberg....
Apache Iceberg, Git-Like Catalog Versioning and Data Lakehouse Management - Pillars of a Robust Data Lakehouse Platform
This is where the combined power of Dremio’s Lakehouse Management features and Project Nessie's catalog-level versioning comes into play....
No Code - Convert XLS/CSV files into Parquet with Dremio
Convert XLS/CSV Files without having to write python...
Why Dremio is a must for Apache Iceberg Data Lakehouses
Why is Dremio so useful for Apache Iceberg data lakehouses...
What is HTMX? Why it Matters? and How to use it.
The framework that is getting all the buzz about reducing the javascript you need to write...
OOP Design Patterns in Javascript
An overview of OOP Design Patterns in Javascript...
An In-Depth Overview of Open Lakehouse Tech: Apache Iceberg & Nessie
How to effectively learn software development
Overview of the Open Lakehouse: Why Dremio?
How to build a Java Spring JSON API from scratch
How to write a JSON API in Scala with Play from scratch
An Approach to Architecting a Lower Cost, Fast and Self-Service Data Lakehouse
Building Full CRUD Rest API's with Flask & FastAPI using PsychoPG2
Handling Cross-Origin Cookies with ExpressJS
Creating a Local Data Lakehouse using Spark/Minio/Dremio/Nessie
Project Nessie: A Look in the Depths
Overview of File Encryption Algorithms for Everyone
Parquet File Compression for Everyone (zstd, brotli, lz4, gzip, snappy)
Dremio and Modern Data Architecture: Data Lakes, Data Lakehouses and Data Mesh
What is Nessie and Why as a Data Engineer or Architect you should care?
Resources for Learning more about Catalog level versioning with Project Nessie & Dremio Arctic (Rollbacks, Branching, Tagging and Multi-Table Txns)
Overview of the Data Lakehouse, Dremio and Apache Iceberg
Understanding the Cutting Edge of Data Engineering...
Using SimpleRPC with SvelteKit 1.0/Typescript
Easy to use RPC in your SvelteKit Application...
Implementing a GraphQL API with a Solid-Start Application
Using this cutting edge framework with a Graphql api...
Implementing a tRPC API with a Solid-Start Application
Using this cutting edge framework with a tRPC api...
Building a Todo List with Solid Start
Seeing the Power of the latest javascript meta-framework...
Understanding Spark Configurations with Apache Iceberg
How to configure Spark for using Apache Iceberg...
5 Reasons Your Data Lakehouse should Embrace Dremio Cloud
How your data lakehouse can expand what's possible with Dremio Cloud....
Brief Hands on Intro to Apache Iceberg
Engineer a Data Lakehouse with Apache Iceberg...
Web Development Glossary 2022
The words you need to know as a web professional...
Introduction to The World of Data - (OLTP, OLAP, Data Warehouses, Data Lakes and more)
An accessible high-level guide for data and non-data professionals...
Guide to JSON, YAML and TOML
Popular formats for configuration...
Making Multiple API Calls in Javascript
Different Patterns of Making Multiple API Calls...
A 2022 Introduction to SQL
Learning Structured Query Language...
2022 MongooseJS Cheatsheet
Details on working with MongooseJS...
Express/EJS/Mongoose Build from Zero to Deploy
Building A Full Stack Application with ExpressJS...
ExpressJS Cheatsheet 2022
Easy reference for using express...
Express Todo List for Beginners
Creating backend applications with Nodejs...
Web Storage API Part 1 - LocalStorage and SessionStorage
Where to store data in the users browser...
Creating a Markdown Blog in 2022 with Next JS
Creating a Blog and Deploying with NextJS...
Javascript DOM & jQuery Cheatsheet 2022
All the main bits summed up in one place...
Getting Started with Scala 3
Powerful Functional & OOP JVM Language...
Understanding SSH and What it is for
Logging Securely and Conveniently with SSH...
Understanding RPC (tour of API protocols, gRPC nodejs walkthrough, and Apache Arrow Flight)
How to write Markdown and where you can use it...
Creating a Consistent Developer Environment with Docker
Using Docker to create an Environment in PHP, Ruby, Python and more...
Why All Developers Should Master Markdown
How to write Markdown and where you can use it...
Becoming a Developer in 2022
How to switch careers within 12-18 months...
How to create a One to Many Relationship with Auth in Python with Masonite
Using A Developer Friendly Web Framework in Python...
Auth with Express with JWT, MongoDB, and Postgres
For simple web development...
Simple Setup for Application Wide State in React
Sharing State Across Your React App with just React (No Redux or Recoil)...
MongoDB Relationships using Mongoose in NodeJS
Guide to Relating Data...
The Guide to How to Implement Authorization in any language and framework
Having Users Login...
Creating a GraphQL Based Habit Tracker with Hasura and React (GraphQL/Hasura 101)
GraphQL Made Easy...
Developer Team Work Best Practices (Git, Agile/Scrum/Kanban, CI/CD)
How to be part of a developer team...
My First React App - 2021 Intro to React
The Basics of React...
Comparing React Router 5, 6, and React Location
For simple web development...
Express Templating Cheatsheet
Server Side Rendering for All the People!...
Methods of Starting a Quick HTTP Server from the Command Line (alternatives to liveserver)
For simple web development...
How to use Neo4j Graph Database in your Node Project (Express, Koa, Fastify, etc.)
Graph Databases are Cool...
Building a Full-Stack Todo App with Typescript, NextJS and Mongo - 0 To Deploy
Trying out the hottest framework around...
Pattern Matching in Javascript with alexmerced-patternmatcher
Like a Switch Statement on Steroids...
Walkthrough - Deploy Anything with Nginx
Getting Data from an external API...
Frontend Javascript Ajax/Http Request Guide
Getting Data from an external API...
Basic Authentication with Node/Express and Mongo
A Beginning oriented dive into databases...
The renaissance of server side rendering with Alpine and HTMX, Reactivity with Minimal JS
A Beginning oriented dive into databases...
3 Ways to make API Requests in React (fetch/axios, merced-react-hooks, react-request)
Getting Data from an external API...
Ultimate Plain Vanilla DOM JS & JQuery Cheatsheet
A Beginning oriented dive into databases...
Understanding Data and Databases 101
A Beginning oriented dive into databases...
Basics of Building a CRUD API with Node (no framework)
Learning the Node HTTP/HTTPS library...
Basics of Building a CRUD API with Typescript (NestJS and FoalTS)
Backend Frameworks with Typescript Support...
Basics of Building a CRUD API with NodeJS - Express, Koa and Fastify
Learning REST conventions with Javascript...
Basics of Building a CRUD API with Ruby Sinatra
Learning REST conventions with Ruby Sinatra...
Basics of Building a CRUD API with Flask or FASTApi
Learning REST conventions with Python Frameworks...
The Concepts of JAMStack 101
Speed, SEO and Security with JAMStack...
Intro to SvelteKit
The new Svelte Based Application Builder...
Basic Intro to NextJS
The next generation of Frontend Frameworks...
Ruby Sinatra with Postgres using Sequel
Connecting a Sinatra App to a Database...
Intro to Ruby Sinatra
Minimalist Ruby Web Framework...
10 Programming Languages Side by Side (JS, Python, Ruby, PHP, GO, Rust, Dart, C Sharp, Java, Ballerina)
Learn Languages by seeing what's similar and different...
How to Work with Masonite - Python Web Framework
Batteries Included Python Web Framework...
FoalTS - Building a Typescript Based API
Making an API with this Typescript Based Framework...
Python Flask 101 - Intro and API Building
Making an API with this Powerful Python Framework...
Creating Deno Scripts Like Node NPM Scripts
Replicating One of Node's Greatest Features...
Creating an API with Deno (import maps, deps.ts, etc.)
Using that Cool New Javascript Runtime...
Using Docker & Docker Compose to Create an Express/Neo4J Dev Environment(Intro to Graph Databases)
The Joys of Docker...
Intro to Fastify & The Liquid Templating Language
Making Sure React Works...
The Basics of React Testing With Jest
Making Sure React Works...
React Conditional Rendering
React for Everyone...
Ultimate 2021 Reference for React Functional Components
React for Everyone...
The Wonderful World of Javascript Bundlers
Yay! Javascript Build Tools....
Ultimate Guide to Javascript Functions
What's your function?...
SolidJS - React meets Svelte?
The new shiny frontend toy!...
Pipenv - Yep, another post about Python Virtual Environments
Venv... the best way?...
Ultimate 2021 Guide to Deploying NodeJS (And DenoJS) Apps to Heroku
Heroku CLI, Continuous Integration and more...
Ultimate 2021 List of CSS Frameworks and Component Libraries for Angular, React, Vue and Svelte
Lots of them...
merced-express - Express with a Ruby on Rails Feel
Express... on Rails......
Ultimate Express & Mongo Reference
Building Backends with NodeJS...
Rust & Rocket - Zero to Deploy
Building APIs with Rust...
What is Batch and Streaming Data? (Data 101)
The fundamentals of data engineering...
Creating APIs with Dart & Google Shelf - Zero to Deploy
Using the Fast Growing Dart Language...
Intro to .Net 5 with VSCode - Zero to Deploy
Microsoft Goes Cross Platform...
API With Java Spring & VSCode, from zero to deploy
Using Java's Most Popular Framework...
Intro to Express, Templating and API's (EJS, Handlebars, Mustache, Pug)
Tales of PHP's Demise are Exaggerated...
API With GO Buffalo, from zero to deploy
Buffalo, The Rails of the Go World...
Hello World - Laravel 101 (From Start to Deploy)
Tales of PHP's Demise are Exaggerated...
More on Python Virtual Environments
Using Python's Built-in venv module...
Django Rest API without DjangoRestFramework
In case you were wondering...
Chart of Backend Web Frameworks 2021
Build your application or microservice...
Many Useful Javascript Tricks
Upping Your Javascript Gains...
Understanding Postgres on Linux
The Little Things that May not be obvious...
What is a Makefile and how do I use them?
Automating all the things...
Go, Rust and C++ Side by Side
Learn All The Things...
Understanding Node Streams with Https.get
Learn How to use Streams...
Javascript Basic Reference
Quickly find what you need...
Ultimate Command Line Reference 2021 - Bash, Git, Node, Python, Ruby, PHP
All the commands you'll need all the time...
Learn Python, PHP, Ruby and Javascript in one Blog Post
Be the Polyglot you know you can be!...
A Tale of Memory and the Garbage Collector
The Stack, The Heap, and Memory Management...
Python Virtual Environment 101
Understanding that Pesky Virtual Environment Thing...
Understanding Dependency Injection
What a Fancy Word...
How to use Netlify Cloud Functions
Your First Cloud Function...
How to use Vercel Cloud Functions
Your First Cloud Function...
Cloud Functions - Server-Side Code On Demand
Why you should embrace serverless tech...
Where Does My Code Run? - Compilers, Interpreters, Transpilers and Virtual Machines
Making Code Happen...
Tips for Aspiring Developers
Good Advice...
Creating a Bosque Programming Language Dev Environment in 2021
Microsoft's Experimental Language...
Understanding and Using Environment Variables in React
Learning How to hide Data...
Intro to PHP
Learn the Hypertext Preprocessor Language...
React Data Flow - Understanding State and Props
Learn the Pre-Hypertext Processor Language...
Understanding React Router
Learn the Pre-Hypertext Processor Language...
Intro to Building Backend Servers with KOAjs
Building a Web Server with Node and KOA...
Deploying React, Angular, Svelte and Vue to Netlify & Vercel
Getting Your Project Online...
Big List of Hosted Headless CMS Providers with Free or Developer Tier in 2021
Get Your JAM Stack ON...
Git - A Guide to Understanding and Using Git
All The Commands in Words that Make Sense...
Git/Github - Making the Switch from Master to Main
Getting with the times...
Deep Dive on Javascript Tooling (Bundlers, Linters, Oh MY!)
Node, ESLint, Babel, Bundlers...
React 101 - Basic JSON Blog from 0 to deployment
A Simple Build to Learn React...
5 Cool Things You Can Do In React
Cool React Tips...
Guide to Becoming a Developer in 2021
Step by Step Guide and Advice...
EZComponent - Open Source Frontend Framework using Web Components
Component Based Frontend Framework...
Getting Started Programming Ballerina 101
Web Application Architecture...
Understanding MVC (Models - Views - Controllers)
Web Application Architecture...
Go/Golang 101 - The Syntax and Basics
A Fun Language for Fast Compiled Apps...
Rust 101 - The Syntax and Basics
A Fun Language for Fast Compiled Apps...
Getting Started with Python Web Framework, FastAPI
The New Fast Web Framework in Python...
Javascript - Writing Map as a Recursive Function
Thinking Recursively...
In-Depth Guide on Understanding Deploying Web Apps
Servers, Ports and Environmental Variables oh my...
Create Your Dev Portfolio with this Gatsby Template
Template designed for Dev Portfolios...
Fundamentals of Client Side Javascript
The Building Blocks of Client-Side Javascript Master...
Ultimate CSS Reference
CSS All the Things!...
React - Why use TaskRunner over Redux, useReducer
A State Management Alternative...
Svelte after Sapper - The Svelte Ecosystem
Enterprise level frontend framework...
Creating a Portfolio/Blog with Angular/Agility CMS
An Introductory Angular Tutorial...
Delivering JSON Data with Netlify
Using Netlify to Deliver Static JSON...
Building a Blog with Agility Headless CMS
Using Headless CMS's...
Web Component Libraries
Libraries of Pre-Built Web Components...
Authorization and Authentication in Concept
Understand JWT, Sessions and Bcrypt...
Creating a Gatsby or NextJS Markdown Blog
Using merced-spinup templates...
6 JS Object Types You May Not Have Used
Getting Advanced with Javascript...
Tutorial - Writing Your First GraphQL API
An Alternative to RESTful APIs...
1 Backend, 5 Frontends - Todo List with Rails, React, Angular, Vue, Svelte, and jQuery
Same app... different frontend frameworks...
Konjection - ORM Helper using Knex and Objection
Connect and Setup Your Models with Ease...
Ruby on Rails Tutorial - Heroku API Deployment
Deploy a Full Crud API Quickly!...
Ruby on Rails Tutorial - Many to Many Relationships
How to create a many to many relationship...
Ruby vs Javascript in Several Images
Learn Ruby through Javascript...
merced-react-hooks => Application State, LocalStorage, Lifecycle
Utility Hooks Library for React...
Making Framework Agnostic Web Components with StencilJS
Ionic's Component Creation Tool...
Ruby on Rails Reference - CLI Commands, Bundler, Macros
All You need to know to be productive in Rails...
npx make-fullstack-app - Scaffolding your back and frontend
One Project, Two Apps, One Command...
Full Crud Mongo/Express API in One Line with MongoRester
Scaffolding Mongo/Express APIs with ease...
Understanding and Solving Cors Errors
Allowing Cross-Origin API Requests...
Axios, Fetch and other useful images!
Javascript in Review...
Create Your Next Microservice with Merver!
NodeJS Micro-Web Framework...
Using mutils to supercharge arrays!
Arrays with superpowers!...
Mongo, Mongoose and Express Reference
Queries and Endpoints oh my!...
Rollout Application Level State Quickly with useDataStore
Context, Reducers and Hooks, yes!...
npx create-react-loaded supercharged react
Router, GlobalState, Sass and more!...
Javascript Callback Array Methods
Map, Reduce, Some, Every, Filter, Find, FindIndex...
Intro to Express
Creating a Backend Server...
React Pro Tips in Several Images
Super Charge Your React Code...
More Merced-Spinup Templates
Gulp, Express and React!...
Learning Svelte 101
Cybernetically Enhanced Web Apps!...
Typescript 101 - Typing, Interfaces and Enums oh MY!
Super charging Javascript Scalability...
Writing Javascript Promises
Under the Hood of Asynchronous Javascript...
Big List of Online Places to Code/Prototype
Code on the Go...
React Cheat Sheet
Props, State, Forms, Classes, Functions...
React in Concept - The Terms and Idea
Props, State, Forms, Classes, Functions...
React Hooks Basics Reference
useEffect, useState, useReducer and more!...
Passing Data Between Components in Vue
Props, Queries and Events oh my!...
Spin-up your next project with merced-spinup
Templates for days...
Mongoose, Connecting to Mongo via Javascript
Documents, Collections and Databases oh my!...
Promises 101 and Fetch, Axios and $.ajax
Asynchronous Javascript in a Nutshell...
Building Your Coder/Developer Brand
Creating the Demand For You...
Ultimate jQuery/Plain Vanilla JS DOM Reference
The Basics of one of the most popular libraries ever...
Javascript Events - In the Browser and Node
A Comprehensive Guide to Javascript Events...
Ultimate Basic Coder Reference (Bash, Git, VSCode, Nodejs, more)
The Basics We Should all know...
Svelte - The New Kids on the Frontend Framework Block
Compiler-based frontend framework...
Web Components Part 3 - Lifecycle Functions
Built in functions that run at certain points...
Guide to Free/Cheap Deployment Options 2020
Quality, Fast Deployment for your static or full stack site...
Ultimate Django Reference (Deployment, Rest API, Commands, .env)
A one stop shop for many of the things you'll have to look up a lot...
Web Components Part 2 - Styling and Slots
Making Pretty and Flexible Components...
Web Components Part 1 - The Basics
Creating, ShadowDOM...
RenderBlocks
Beginners Tutorial...
mBlocks - Frontend UI Library
Beginners Tutorial...
MercedUI - Web Components with Super Powers
Beginners Tutorial...
Intro to Angular 9 Tutorial
Enterprise level frontend framework...
AMPonent, Webcomponent Building Library
Building Reactive and Styling UI Components...
Frontend CRUD with Plain Vanilla JS
A Basic Exploration of Frontend DOM Manipulation...
React 101 Tutorial
Basics of How React Works...
Ruby on Rails API with JWT Auth Tutorial
Creating a Ruby on Rails API with Auth...
Hello World in Vue
First Blog Post and Vue Tutorial...