Amazon Web Services (AWS) is guided by customer obsession, pace of innovation, commitment to operational excellence, and long-term thinking. By democratizing technology for nearly two decades and making cloud computing and generative AI accessible to organizations of every size and industry, AWS has built one of the fastest-growing enterprise technology businesses in history. Millions of customers trust AWS to accelerate innovation, transform their businesses, and shape the future. With the most comprehensive AI capabilities and global infrastructure footprint, AWS empowers builders to turn big ideas into reality. Learn more at aws.amazon.com and follow @AWSNewsroom.
Sponsors
Platinum
Databricks is the data and AI company. More than 20,000 organizations worldwide — including adidas, AT&T, Bayer, Block, Mastercard, Rivian, Unilever, and over 60% of the Fortune 500 — rely on Databricks to build and scale data and AI apps, analytics and agents. Headquartered in San Francisco with 30+ offices around the globe, Databricks was founded by the original creators of the lakehouse architecture, Apache Spark™, Delta Lake, MLflow and Unity Catalog. To learn more, follow Databricks on LinkedIn, X, YouTube, and Instagram. Phone: 1-866-330-0121 | Contact Us
Dremio is the Agentic Lakehouse—the only data platform built for agents and managed by agents. We help organizations implement and scale AI by unifying data, enforcing governance, and providing business context through our semantic layer, enabling fast, accurate insights for business teams and AI agents. By automating lakehouse operations and delivering industry-leading price-performance, Dremio provides the fastest path to AI and analytics without pipelines, lock-in, or operational burden.
Etleap is a cloud-native data pipeline platform built for modern data foundations on Apache Iceberg. By unifying ingestion, transformation, and operations into a single system, Etleap helps data platform teams deliver trustworthy, reliable, reusable data for downstream use cases, including analytics and AI, without building and maintaining a custom pipeline platform.
Founded by experienced data engineering practitioners, Etleap focuses on removing the operational friction that holds teams back from adopting modern data foundations. For more information, visit https://etleap.com.
Google Cloud is the new way to the cloud, providing AI, infrastructure, developer, data, security, and collaboration tools built for today and tomorrow. Google Cloud offers a powerful, optimized AI stack with its own planet-scale infrastructure, custom-built chips, generative AI models and development platform, as well as AI-powered applications, to help organizations transform. Customers in more than 200 countries and territories turn to Google Cloud as their trusted technology partner.
Microsoft helps organizations unlock the full value of their data through open, interoperable analytics platforms. With Microsoft Fabric and OneLake, customers can work with Apache Iceberg–based data across engines and clouds—without duplication—enabling faster insights and AI at scale. Committed to open standards, security, and ecosystem collaboration, Microsoft partners with the data community to simplify analytics and power intelligent decision‑making.
Ryft is the Automated Iceberg Management Solution. We help data teams create a truly open, automated and cost-effective Iceberg lakehouse, by maintaining and optimizing Iceberg tables in real time, all based on actual usage patterns. Ryft also automates governance, GDPR compliance and data lifecycle so data stays secure and compliant.
Snowflake delivers the AI Data Cloud — a global network where thousands of organizations mobilize data with near-unlimited scale, concurrency, and performance. Inside the AI Data Cloud, organizations unite their siloed data, easily discover and securely share governed data, and execute diverse analytic workloads. Wherever data or users live, Snowflake delivers a single and seamless experience across multiple public clouds. Snowflake’s platform is the engine that powers and provides access to the AI Data Cloud, creating a solution for data warehousing, data lakes, data engineering, data science, data application development, and data sharing. Join Snowflake customers, partners, and data providers already taking their businesses to new frontiers in the AI Data Cloud.
Gold
CelerData is a sub-second SQL engine powered by StarRocks that unifies real-time analytics, lakehouse, and AI workloads. It scales to millions of users under strict SLAs, syncs mutable data without redundant ETL, and supports Apache Iceberg, vector search, and full-text search on a governed platform.
Established in 2009, ClickHouse leads the industry with its open-source column-oriented database system, driven by the vision of becoming the fastest OLAP database globally. The company empowers users to generate real-time analytical reports through SQL queries, emphasizing speed in managing escalating data volumes.
Cloudera is the only data and AI platform company that large organizations trust to bring AI to their data anywhere it lives. Unlike other providers, Cloudera delivers a consistent cloud experience that converges public clouds, data centers, and the edge, leveraging a proven open-source foundation. As the pioneer in big data, Cloudera empowers businesses to apply AI and assert control over 100% of their data, in all forms, delivering unified security, governance, and real-time predictive insights. The world’s largest organizations across all industries rely on Cloudera to transform decision-making and ultimately boost bottom lines, safeguard against threats, and save lives.
Firebolt is the analytical database for teams running production-grade customer-facing analytics, ELT workloads, and AI agents, with an unparalleled price-performance ratio.
MinIO is the data foundation for enterprise AI. Built for exascale performance and limitless scale, MinIO AIStor delivers a secure, sovereign, and AI-ready data store that spans from edge to core to cloud. With rampant adoption across the Fortune 100 and 500, MinIO is redefining how organizations and government agencies store, manage, and mobilize all of their data in the AI era. MinIO is backed by Jerry Yang’s AME Cloud Ventures, Dell Technologies, General Catalyst, Index Ventures, Intel Capital, Softbank Vision Fund 2 and others.
StarTree is the real-time analytics database platform built on Apache Pinot, powering customer-facing applications and continuous operational insights for companies including DoorDash, Stripe, Cisco, and others. At this year’s Iceberg event, we’re showcasing SLA-driven analytics on Apache Iceberg. SLA-driven analytics means guaranteed sub-second query performance at P99 and guaranteed data freshness, even as concurrency and data volumes scale. Critically, this is achieved without reverse ETL pipelines or caching data outside of Iceberg to meet SLAs.
StreamNative is the real-time data and agent infrastructure company—delivering lakehouse‑native streaming for Kafka and Pulsar, plus Orca Agent Engine, a governed runtime where AI agents safely run on that live data. Powered by the Ursa engine, StreamNative reimagines Kafka‑compatible streaming with a leaderless, lakehouse‑native design that writes streams directly to Apache Iceberg and Delta Lake tables—reducing cluster sprawl and helping teams cut infrastructure costs without changing applications. Orca provides an event‑driven runtime for deploying, coordinating, and scaling always‑on agents on the same streaming backbone, under the governance and observability of the platform. Together, StreamNative helps enterprises keep context fresh, cut streaming sprawl, and put agentic AI into production without losing control.
Silver
Since 2016, dbt Labs has been on a mission to help data practitioners create and disseminate organizational knowledge. dbt is the standard for AI-ready structured data. Powered by the dbt Fusion engine, it unlocks the performance, context, and trust that organizations need to scale analytics in the era of AI. Globally, more than 80,000 data teams use dbt, including those at Siemens, Roche and Condé Nast. Learn more at getdbt.com, and follow dbt Labs on LinkedIn, X, Instagram, and YouTube.
Fivetran delivers the trusted data foundation enterprises need to scale analytics, operations, and AI with confidence. By unifying data movement, management, and transformation, Fivetran enables a secure, reliable, and portable foundation for AI across clouds, engines, and tools. Leading organizations including OpenAI, Verizon, and Pfizer rely on Fivetran to help data teams operate at peak productivity, turning raw data into intelligent experiences that drive competitive advantage. Learn more at Fivetran.com.
Redpanda is a data streaming platform whose Iceberg Topics component automatically transforms Apache Kafka®-compatible messages into Apache Iceberg™ tables in real time. This allows users to query their real-time streaming data in an established Iceberg deployment, with no connectors or additional technology required. Redpanda Iceberg Topics integrate with an expanding list of Iceberg catalogs and query engines, including Databricks, Snowflake, Google BigQuery, AWS Glue, and more. With the new Redpanda Agentic Data Plane (ADP), you can also integrate your analytical systems with agentic systems through our AI gateway, which provides governance, efficiency, and observability.
RisingWave Labs is the company behind RisingWave, a unified system that integrates storage, real-time transformations, interactive queries, and native Apache Iceberg™ support. RisingWave OpenLake: Managed Iceberg™ Tables. Built by the main contributors to the Apache Iceberg Rust project, RisingWave Open Lake is the easiest way to launch and scale your Apache Iceberg–based lakehouse. It brings together catalog hosting, streaming data ingestion, and automatic table maintenance, all while providing full interoperability so every query engine can truly access your data in a fresh and consistent manner. Available in open source and in our fully managed service.
Starburst is the flexible data platform that delivers fast, secure access to all your data wherever it lives. Built on an open data stack with Trino and Apache Iceberg, Starburst unifies distributed data across clouds, on-premises, and in hybrid environments, without complex or costly migrations, unleashing the full power of the data lakehouse for analytics and AI.
With our Lakeside AI architecture, organizations gain federated access, governed collaboration, and full data lineage, empowering compliant, scalable AI. Trusted by global leaders, Starburst helps enterprises turn data into business value.
From insights to action to AI, Starburst fuels innovation at every level. Learn more at starburst.ai.
Tower builds the “last mile” production platform for data engineers in the AI era. It enables teams to turn AI-generated and human-curated data pipelines into reliable, production-ready systems. Tower unifies Python compute with storage based on open Apache Iceberg standards, ensuring AI-enriched apps stay grounded in fresh, company-specific data.
Bronze
Unifying the Data Lifecycle: From Stream to Search on Open Standards. No Silos.
The promise of a modern data lakehouse is often undermined by “architecture sprawl”—the need to manage separate silos for streaming, relational data, and search. Aiven breaks these barriers by providing a single, open-source foundation built for the Apache Iceberg ecosystem.
Artie makes real-time data streaming simple for modern data teams. It’s a production-ready system for replicating operational data into analytics and AI platforms in real time, without years of internal platform work. Artie automates the full ingestion lifecycle. This includes change capture, transactional merges, backfills, schema evolution, and observability. Pipelines remain correct and reliable as they scale. Teams use Artie to keep systems like Apache Iceberg continuously up to date for real-time analytics and AI in production. Artie scales to billions of change events per day and is trusted by teams at Substack, ClickUp, and Alloy to ship faster and scale with confidence.
The Cloudflare Developer Platform is a global, serverless ecosystem that allows developers to build and deploy full-stack applications directly on Cloudflare’s massive edge network. Cloudflare extends this infrastructure into a complete data platform that enables developers to ingest, store, and query massive datasets directly on Cloudflare’s edge. By combining Cloudflare Pipelines for data ingestion and stream processing, R2 Data Catalog for managed Apache Iceberg tables, and R2 SQL for high-performance distributed queries, traditional egress fees and infrastructure complexity are eliminated, providing a truly serverless and cost-effective analytics platform.
Datastrato is building the open data fabric platform to accelerate trusted AI. The company is the original creator of Apache Gravitino, a unified metadata platform for AI – multi-cloud, multi-engine and multi-modal.
Eon is changing the cloud data protection space by introducing a new storage tier that turns backups into immediately accessible data lakes — seamlessly automated, radically cost-efficient, and instantly usable for AI and analytics.
Espresso AI uses machine learning to automatically optimize Snowflake and Databricks warehouses. Founded by ex-Googlers, we apply research from DeepMind to instantly reduce Snowflake and Databricks SQL costs.
How does it work?
Espresso AI is Kubernetes for Snowflake: we intelligently route queries across warehouses to increase utilization and cut cost. Under the hood, we operate a proxy that sits between you and your Snowflake instance. This proxy has an ML-powered backend that understands how your workloads scale and parallelize, enabling real-time routing decisions that can automatically cut your bill by up to 70%.
Founded in 2003, LinkedIn is the world’s largest professional network with more than 1 billion members in more than 200 countries and territories worldwide. The mission of LinkedIn is simple: connect the world’s professionals to make them more productive and successful.
LinkedIn Infrastructure powers the data platforms behind the Economic Graph, enabling engineers to process and analyze massive datasets at global scale. Our teams build and operate distributed storage, compute, and analytics systems that support large-scale data processing, AI innovation, and reliable member experiences. We actively collaborate with the open source community, including the Apache ecosystem, to advance modern data infrastructure.
OLake is an open-source data ingestion and replication platform that syncs data from PostgreSQL, MongoDB, and Kafka into Apache Iceberg for near real-time analytics. Natively built for Iceberg, it is one of the fastest tools to get your data into Iceberg.
Built for scalable, CDC-friendly pipelines with UI and CLI workflows, OLake is evolving beyond ingestion with upcoming table observability and optimization features to improve data reliability and Iceberg table performance.
OLake comes with deployment options like docker-compose and helm for its OSS offering.
PuppyGraph is the first and only real-time, zero-ETL graph query engine in the market, empowering data teams to query existing relational data stores as a unified graph model in under 10 minutes, bypassing traditional graph databases’ cost, latency, and maintenance hurdles. Capable of scaling with petabytes of data and executing complex 10-hop queries in seconds, PuppyGraph supports use cases from enhancing LLMs with knowledge graphs to fraud detection, cybersecurity, and more. PuppyGraph is trusted by industry leaders, including AMD, Coinbase, Netskope, Palo Alto Networks, eBay, and more. Learn more at www.puppygraph.com.
Teradata Autonomous AI and Knowledge Platform activates enterprise intelligence by unifying data, knowledge, and business context, for real-time outcomes. Enterprises can combine these for impact, connecting and scaling across any environment whether it’s in the cloud, on premises, or hybrid to deliver the full value of AI. Learn more at Teradata.com.
VeloDB, powered by Apache Doris, is a real-time analytics and search database for diverse workloads, bringing real-time analytics and search capabilities wherever your data lives. It enables users to query, search, and aggregate data in a single environment, simplifying the architecture. With advanced indexing, storage optimization, and a vectorized MPP engine, VeloDB delivers consistent, low-latency, high-concurrency performance for customer-facing analytics, AI applications, and observability. Native integration with modern table formats, such as Iceberg and Delta tables, brings the governance and flexibility of an open data lakehouse while maintaining database-grade speed. Whether deployed in the cloud or on-premises, VeloDB accelerates decision-making and powers modern analytics with unmatched performance, cost-efficiency, and architectural simplicity.
Zilliz is a global leader in vector database technology, delivering open source and enterprise solutions for next-generation AI applications. Founded in San Francisco, Zilliz has raised $112M from top investors. Its core project, Milvus, is the most widely adopted open source vector database, with 42,000+ GitHub stars and 100M+ downloads. Zilliz Cloud provides a fully managed multi-cloud service across AWS, Google Cloud, and Azure, powering RAG, AI agents, semantic search, and other critical AI workloads worldwide.
Want to sponsor THE Iceberg conference of the year?
Please contact [email protected] for more details about sponsorship opportunities.