Category Archives: AI

AI News Briefs BULLETIN BOARD for March 2026

Welcome to the AI News Briefs Bulletin Board, a timely channel bringing you the latest industry insights and perspectives on AI, including deep learning, large language models, generative AI, and transformers. I am working tirelessly to dig up the most timely and curious tidbits underlying the day’s most popular technologies. This field is advancing rapidly, and I want to give you a regular resource that keeps you informed and up to date. News bites are added continually in reverse date order (most recent on top), so check back often to see what’s happening in our rapidly accelerating industry. Click HERE to check out previous “AI News Briefs” round-ups.

[3/5/2026] Awesome R Stats Skills – AI agent skills are reusable prompt files (typically SKILL.md) that extend coding assistants like Claude Code with specialized, domain-specific workflows. This list focuses on skills useful for R users.

[3/5/2026] Microsoft built Phi-4-reasoning-vision-15B to know when to think — and when thinking is a waste of time – Phi-4-reasoning-vision-15B is a compact open-weight multimodal AI model that matches or exceeds the performance of systems many times its size. The 15-billion-parameter model can process both images and text and can reason through complex math and science problems, interpret charts and documents, and navigate graphical user interfaces. It is available now through Microsoft Foundry, Hugging Face, and GitHub under a permissive license. The model was trained on only approximately 200 billion tokens of multimodal data, much less than competing models.

[3/5/2026] GPT-5.4 reportedly brings a million-token context window and an extreme reasoning mode – OpenAI’s upcoming GPT-5.4 model will have a one-million-token context window, putting it on par with Google and Anthropic’s offerings. The model will feature an ‘extreme’ thinking mode that lets it burn significantly more compute on tough questions. It is expected to be more reliable and make fewer mistakes on longer tasks that can run for several hours. OpenAI’s more frequent model release cadence is designed to keep expectations in check. The hype around GPT-5’s launch set a high bar that was nearly impossible to clear, and the company’s user growth has recently fallen short of internal projections.

[3/5/2026] Google introduces a training method that teaches LLMs to update probabilities when new evidence appears – Google researchers introduced a method called Bayesian teaching that trains large language models to update their beliefs as new information appears. This solves a common limitation in LLM agents. Many models treat each interaction independently and fail to adapt when users provide new signals. Bayesian teaching trains models on examples generated by a Bayesian assistant that follows optimal probability updates. By mimicking those decisions, the LLM learns how to maintain uncertainty, weigh evidence, and improve predictions over time.

Key details:

  • Uses supervised fine-tuning on simulated user interactions and recommendation tasks
  • Teaches models to update predictions across multiple interaction rounds
  • Learns probability updates instead of memorizing correct answers
  • Off-the-shelf LLMs plateau after one interaction in recommendation tasks
  • Bayesian assistant reaches 81% accuracy in the evaluation benchmark
  • Bayesian-trained models match the assistant’s predictions ~80% of the time
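The probability updates described above can be illustrated with a toy conjugate model. This is a hypothetical sketch of the kind of updating a Bayesian assistant performs (and the LLM learns to mimic), not Google’s actual training code:

```python
# Hypothetical sketch: the kind of posterior update a Bayesian assistant
# performs as a user accepts or rejects recommendations. Not Google's code.

def beta_update(alpha: float, beta: float, accepted: bool) -> tuple[float, float]:
    """Conjugate Beta-Bernoulli update: one observation shifts the belief."""
    return (alpha + 1, beta) if accepted else (alpha, beta + 1)

def predict(alpha: float, beta: float) -> float:
    """Posterior mean: probability the user accepts the next similar item."""
    return alpha / (alpha + beta)

# Start from a uniform prior, then observe three interaction rounds.
a, b = 1.0, 1.0
for accepted in [True, True, False]:
    a, b = beta_update(a, b, accepted)

print(predict(a, b))  # prints 0.6
```

An off-the-shelf model that treats each round independently would keep answering from the prior; mimicking updates like this across rounds is what lets the trained model keep improving instead of plateauing after one interaction.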

[3/4/2026] I Had Claude Read Every AI Safety Paper Since 2020, Here’s the DB – A compiled database of nearly 4,000 AI safety papers published since 2020 aims to streamline the search for relevant research in a field whose overwhelming volume often buries substantive work. Claude assisted in summarizing, tagging, and compiling the papers, using citation-based methods to overcome challenges in searchability and dataset availability. The database makes it quick to access pertinent research and datasets, which is crucial for AI safety projects.

[3/4/2026] Gemini 3.1 Flash‑Lite – Google introduced Gemini 3.1 Flash‑Lite, a low‑cost, high‑speed model designed for large‑scale developer workloads. The model offers faster latency and output speeds than 2.5 Flash while costing $0.25 per million input tokens and $1.50 per million output tokens.

[3/4/2026] GPT‑5.3 Instant – OpenAI released GPT‑5.3 Instant, an update focused on improving conversational flow, answer relevance, and web search results in ChatGPT. The model also reduces unnecessary refusals and overly defensive responses to produce more direct answers.

[3/4/2026] Introducing the Arm x NVIDIA Developer Community – Arm and NVIDIA have been powering AI workloads from cloud to edge for years. Now, we’re bringing that collaboration directly to developers. Introducing the Arm x NVIDIA Developer Community—a dedicated technical space for those building across Arm CPUs and NVIDIA GPUs, from Grace and DGX to Jetson. Connect with the teams behind the platforms and explore the latest technical resources.

Key details:

  • Direct access to Arm and NVIDIA engineers on Discord
  • Live code sessions, deep-dives, workshops, and hackathons
  • Updates on the latest cross-stack Learning Paths
  • Project spotlights and community recognition

[3/3/2026] Andrew Ng Says AGI Is Decades Away—and the Real AI Bubble Risk Is in the Training Layer – Andrew Ng, founder of DeepLearning.AI and Coursera, executive chairman of Landing AI, and founding lead of the Google Brain team, says that AI capable of performing the full breadth of human intellectual tasks remains decades away. He recently appeared in an interview where he discussed enterprise adoption of agentic AI, whether AI is in a bubble, the AI infrastructure build-outs, geopolitical fragmentation and its effects on global AI strategy, and more. This post contains a transcript of that interview.

[3/3/2026] Why XML Tags Are so Fundamental to Claude – Structuring prompts with XML tags can transform how you work with Claude. Anthropic’s prompting framework explicitly incorporates XML tags as key elements, and this repurposing of XML technology may be a core part of what makes Claude distinctive. Tags give Claude the ability to distinguish the transition from first-order to second-order expressions, a mechanism fundamentally required for information transfer between any two entities. Claude’s grasp of delimiters is crucial to every act of processing and communicating information, and it is this capacity that makes Claude so effective at interpreting layered meaning.
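In practice the convention is simple: wrap each part of the prompt in descriptive tags so instructions and data cannot bleed into each other. A minimal sketch (the tag names here are illustrative; any consistent names work):

```python
# Minimal sketch of XML-tagged prompting: tags act as delimiters so the
# model can tell instructions apart from the data it should operate on.

document = "Q3 revenue grew 12% year over year, driven by cloud services."

prompt = f"""<instructions>
Summarize the document below in one sentence. Do not add information.
</instructions>

<document>
{document}
</document>"""

print(prompt)
```

Because the document text sits inside its own tagged block, instructions hidden in the data are less likely to be mistaken for instructions to the model.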

[3/3/2026] Alibaba’s small, open source Qwen3.5-9B beats OpenAI’s gpt-oss-120B and can run on standard laptops – Alibaba recently unveiled its Qwen3.5 Small Model Series. Qwen3.5-0.8B and 2B are intended for prototyping and deployment on edge devices where battery life is paramount. Qwen3.5-4B is a strong multimodal base for lightweight agents with a 262,144 token context window. Qwen3.5-9B is a compact reasoning model that outperforms OpenAI’s open source gpt-oss-120B on key third-party benchmarks. The weights for the models are available now under Apache 2.0 licenses on Hugging Face and ModelScope.

[3/3/2026] Anthropic vs. White House puts $60 billion at risk – The $60 billion investment in Anthropic from over 200 venture capital investors is now at risk due to a contract dispute with the Pentagon. Designating Anthropic a supply chain risk is unprecedented: it would prevent other military contractors from deploying Claude in their applications and could require companies like NVIDIA, which do business with the US military, to sever their commercial ties with Anthropic. The situation is becoming an existential moment for American AI and its investors.

[3/2/2026] The #1 misconception I see in beginner data science: correlation = causation. I teach it this way: “Correlation helps you find where to look. Causation tells you what to do.” Big difference.
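The distinction is easy to demonstrate with synthetic data: below, a hidden confounder (temperature) drives both ice-cream sales and drowning incidents, producing a strong correlation with no causal link between the two variables themselves.

```python
import numpy as np

# Toy demonstration: temperature (the confounder) drives both variables,
# so they correlate strongly even though neither causes the other.
rng = np.random.default_rng(42)
temperature = rng.normal(25, 5, 1000)
ice_cream_sales = 2.0 * temperature + rng.normal(0, 2, 1000)
drownings = 0.5 * temperature + rng.normal(0, 1, 1000)

r = np.corrcoef(ice_cream_sales, drownings)[0, 1]
print(f"correlation: {r:.2f}")  # strong, but banning ice cream saves no one
```

Correlation told you where to look (something seasonal links the two); only a causal model, here the temperature variable, tells you what intervention would actually matter.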

[3/2/2026] MIT just mass-released their AI library for free! – I love MIT Press books, and I use many of them regularly. Most people pay thousands for bootcamps that teach half of this. Bookmark it. Start anywhere. Just start.

1. Foundations of Machine Learning

2. Understanding Deep Learning

3. Algorithms for ML

4. Deep Learning

5. RL Basics (Sutton & Barto)

6. Distributional RL

7. Multi-Agent Systems

8. Long Game AI

9. Fairness in ML

10. Probabilistic ML (Part 1)

11. Probabilistic ML (Part 2)

[3/2/2026] Marc Andreessen: The real AI boom hasn’t even started yet – Featured on Lenny’s Podcast, Marc Andreessen is a founder and investor, co-founder of Netscape, and co-founder of the venture capital firm Andreessen Horowitz (a16z). The conversation digs into why we’re living through one of the most incredible times in history, and what comes next.

[3/2/2026] Sakana AI releases open Doc-to-LoRA and Text-to-LoRA, generating LoRA adapters in a single forward pass – Sakana AI introduces Doc-to-LoRA and Text-to-LoRA, two systems that let you update large language models without running a new fine-tuning job.

Instead of retraining a model or stuffing long documents into the prompt, you train a separate model called a hypernetwork once. That hypernetwork generates small weight updates called LoRA adapters in a single forward pass. You attach the adapter to a frozen base model and get new knowledge or new skills instantly.

Key details:

  • Avoids expensive fine-tuning pipelines for each new task
  • Removes repeated long-document prompts and high memory use
  • Cuts per-update latency to sub-second adapter generation
  • Near-perfect zero-shot accuracy beyond 4× base context window
  • 75.03% accuracy on Imagenette via visual-to-text weight transfer
  • Matches performance of adapters trained on 9 benchmark tasks
  • Train the hypernetwork once on representative tasks
  • Provide a document or task description at deployment
  • Generate a LoRA adapter and attach it to your base model
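To make the mechanics concrete, here is a hedged sketch (illustrative names and shapes, not Sakana AI’s code) of what “attaching a LoRA adapter to a frozen base model” means: the adapter is a pair of low-rank matrices whose product is added to a frozen weight, and a hypernetwork like Text-to-LoRA would emit those matrices in a single forward pass.

```python
import numpy as np

# Illustrative sketch, not Sakana AI's implementation: a LoRA adapter is a
# low-rank pair (A, B) whose product is added to a frozen weight matrix.
# In Text-to-LoRA, a hypernetwork emits A and B in one forward pass; here
# they are random/zero just to show how the adapter attaches.

d, r = 8, 2                          # model dim and LoRA rank (r << d)
rng = np.random.default_rng(0)

W_frozen = rng.normal(size=(d, d))   # base weight, never updated
A = rng.normal(size=(d, r))          # in Text-to-LoRA these would come
B = np.zeros((r, d))                 # from the hypernetwork's output

def forward(x: np.ndarray) -> np.ndarray:
    # Effective weight is W + A @ B; with B = 0 the base model is unchanged.
    return x @ (W_frozen + A @ B)

x = rng.normal(size=(1, d))
assert np.allclose(forward(x), x @ W_frozen)  # zero-init adapter is a no-op
```

Swapping in a different (A, B) pair retargets the same frozen model to a new task, which is why generating adapters in one pass is so much cheaper than re-running a fine-tuning job for each task.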

[3/2/2026] Anthropic introduces Claude Import Memory to transfer context from ChatGPT and Gemini – Instead of retraining a new assistant about your stack, tone, and ongoing projects, you transfer that information in one step. Claude stores it as persistent memory, which means it saves details across sessions and reuses them later. This solves a common problem: starting from zero every time you try a new model.

Key details:

  • Imports preferences, workflows, and project context in one copy-paste
  • Updates Claude’s long-term memory instantly
  • Reduces repeated setup across conversations
  • Maintains continuity across tools
  • Works on all paid Claude plans
  • Copy Anthropic’s import prompt.
  • Paste it into your current assistant.
  • Generate the memory summary.
  • Open Claude → Settings → Memory.
  • Paste and save.

Claude then responds using your imported context in future chats.

Together AI Announces Business And Product Milestones At First AI Native Conference

Event showcases breakthroughs in AI infrastructure, open source research and reinforcement learning

Together AI, the AI Native Cloud powering some of the world’s fastest-growing AI companies, today launched AI Native Conf, its first-ever conference dedicated to builders creating the next generation of AI applications. The event comes amid rapid business momentum for Together AI, which now serves thousands of customers, supports over one million developers, and has achieved 10x year-over-year growth in annual contract revenue (ACR), including 27 customer deals exceeding $1 million and one exceeding $1 billion.

Together AI has emerged as a core infrastructure provider for leading AI-native companies including Cursor, Decagon, and Cartesia, delivering production-scale inference, pre-training and model shaping. With an industry-leading systems research lab led by the creators of FlashAttention and ThunderKittens, Together AI sits at the intersection of frontier research and real-world deployment.

“AI is moving faster than any technological shift we’ve seen before, and the companies being built today look fundamentally different,” said Vipul Ved Prakash, co-founder and CEO of Together AI. “This event is about bringing together the AI-native builders at the frontlines and sharing what it actually takes to run AI in production at scale. Our advantage is simple: the same researchers who publish foundational work are the ones shipping it into production systems our customers rely on.”

Announcing New Research Breakthroughs and Products From The AI-Native Cloud

At AI Native Conf, Together AI unveiled new research-to-production advancements across kernels, reinforcement learning, and inference optimization, underscoring the company’s deep research bench and rapid development cadence. These advancements will help companies improve training and inference performance to enable businesses of all sizes to capitalize on the benefits of generative and agentic AI.

Key announcements include:

  • FlashAttention 4, the latest evolution of the widely adopted kernel now powering most major language models in production. FlashAttention 4 delivers up to 4x performance improvements at long sequence lengths, narrowing the gap between theoretical and real-world performance for long-context workloads like coding agents and document reasoning.
  • A new Reinforcement Learning API that decouples inference and training, enabling globally distributed reinforcement learning pipelines that were previously only feasible for organizations with massive, co-located GPU clusters.
  • ThunderAgent, an open-source, program-aware system for serving and training agentic workloads, delivering up to 3.6x throughput improvements and significantly reduced memory overhead.
  • ATLAS-2, which uses real-time user data to adapt and optimize, immediately delivering 1.5X faster inference results.

A Gathering for the AI-Native Generation

AI Native Conf was created in response to the rapid emergence of AI-native startups. Generative AI is already being used by 70% of companies, according to McKinsey, making it the fastest-adopted major technology platform in history. As a result, a new class of AI-native companies is scaling at unprecedented speed, with many reaching $100 million in ARR faster than any previous generation of startups.

The conference features leaders building at the frontier of AI, including Grant Lee, co-founder and CEO of Gamma, and Arjun Desai, co-founder of Cartesia, alongside researchers and engineers deploying AI systems at massive scale.

For a complete recap of research and product announcements from AI Native Conf, visit today’s post on the Together AI blog.

QuitGPT Campaign Protests Outside OpenAI HQ in San Francisco

Rally Comes as ChatGPT Uninstalls Surge 295% and Claude Downloads Hit All-Time High

On Tuesday, March 3, 2026, members of QuitGPT, a grassroots boycott campaign with over 2.5 million supporters, gathered outside OpenAI’s San Francisco headquarters to protest the company’s deal with the U.S. Department of Defense to deploy AI for autonomous weapons systems and mass surveillance of American citizens.

The protest happened amid a dramatic consumer backlash. According to market intelligence provider Sensor Tower, U.S. uninstalls of the ChatGPT mobile app jumped 295% day-over-day on Saturday, February 28, compared to a typical daily uninstall rate of just 9%. One-star reviews for the app surged 775% on the same day, and U.S. downloads of ChatGPT fell 13% day-over-day. The reaction represents one of the sharpest single-day consumer revolts against a major tech platform in recent memory.

The decision stands in stark contrast to that of competitor Anthropic, which declined a similar request from the Department of War last week, citing concerns that AI would be used to surveil American citizens and to operate in fully autonomous weapons systems that AI is not yet safe to control. The public response was swift: Anthropic’s Claude app saw downloads surge 37% day-over-day on Friday, February 27, and a further 51% on Saturday. Claude climbed to the No. 1 spot on the U.S. App Store by Saturday — a rise of over 20 ranks in under a week. Data provider Appfigures noted that Claude’s total daily U.S. downloads surpassed ChatGPT’s for the first time ever.

“OpenAI had a choice, and they chose contracts over conscience,” said a QuitGPT organizer. “When over 1 million people join a boycott and consumers are deleting apps by the hundreds of thousands, the message is clear: the public will not accept AI being weaponized against them. We’re here today to make that unmistakably clear.”

QuitGPT is calling on the public to cancel their OpenAI subscriptions and take a pledge at quitgpt.org. The organization has rallied over 1.5 million people behind its demands through a sustained campaign for corporate accountability in the AI industry.

Cybersecurity Heavyweights Launch JetStream with $34M Seed Round to Bring Governance to Enterprise AI

Backed by Redpoint Ventures, CrowdStrike Falcon Fund, CrowdStrike CEO George Kurtz, Wiz CEO Assaf Rappaport, and Okta Vice-Chairman Frederic Kerrest, the company was founded by veteran security operators with a mission to accelerate enterprise AI success

JetStream Security, a new company founded by veteran security operators from CrowdStrike, Dazz, SentinelOne, Cohesity, McAfee, and Attivo Networks, today launched its AI governance platform. JetStream raised $34 million in a massively over-subscribed seed round led by Redpoint Ventures, with participation from the CrowdStrike Falcon Fund. Cybersecurity luminaries like George Kurtz (CrowdStrike), Assaf Rappaport (Wiz), and Frederic Kerrest (Okta) are some of its blue-chip angel investors.

Companies are racing to deploy AI agents, bots, applications, MCP servers, and custom-built models that they don’t fully understand, can’t clearly monitor, and struggle to control. Most technology leaders still can’t answer basic questions: what data is being accessed, how their AI systems behave, who is accountable when there’s an AI incident, or what these systems really cost. When AI scales faster than governance, risk becomes the adoption constraint.

Today, 93% of executives report challenges implementing AI governance and security guardrails, indicating that AI controls are ripe for innovation. At the same time, expectations are rising: more than 80% of CEOs are increasingly optimistic about the ROI of their AI investments, yet half believe their own jobs are at risk if those investments fail to deliver.

JetStream brings clarity to this chaos, giving enterprises unified visibility and control, so that AI becomes a strategic asset, not a hidden liability. The company’s thesis is that AI is ready for takeoff – but trust in AI remains nascent. Lack of trust is the main blocker to wider adoption and the reason so many organizations experience difficulty moving from pilot to production. Trust requires governance and security control capabilities that span the entire AI lifecycle. Trust is what enables leadership to give the green light for the production use of AI.

At the core of the platform is JetStream AI Blueprints™: dynamic, system-generated graphs of all the resources working toward a shared goal. They show how AI operates in an environment at any moment. Each Blueprint maps the relationships among agents, the models they use, the data they access, the tools they call, and the identities behind every action, whether human, agentic, or non‑human. Unlike static diagrams, Blueprints track real runtime behavior: they flag deviations from the authorized purpose and can be updated to reflect new changes through an authorization workflow. Blueprints also track the cost of each workflow, showing what every agent run costs and who is responsible for that spend. In short, a Blueprint is the operational contract for an AI workflow and serves as a single source of truth for all your AI assets. It makes behavior and cost visible, attributable, and governable.

“AI is moving faster than most organizations can manage,” said Raj Rajamani, CEO and co-founder of JetStream. “Leaders are being asked to bet their businesses and careers on systems they can’t fully see, explain, or control. That’s where trust breaks down. With AI Blueprints™, we give teams a clear, practical way to understand what their AI is doing, manage risk in real time, and move from experimentation to production with confidence. Our goal is simple: help companies scale AI responsibly, without slowing innovation.”

JetStream’s mission is to accelerate enterprise AI adoption by delivering a governance‑grade inspection and control layer that helps enterprises see, understand, and manage how agents operate across the enterprise. JetStream establishes agentic identity governance and controls for virtual workforces while maintaining agent‑level cost controls without slowing innovation.

“The pace of AI innovation is moving faster than most enterprises can safely absorb,” said Erica Brescia, Managing Director at Redpoint Ventures, a JetStream investor. “What stood out to us about JetStream is not just the product as an answer to major challenges, but also the team behind it. These are operators who’ve previously been ahead of every major security shift, and we trust them to stay ahead as agentic AI reshapes how organizations operate.” 

JetStream’s founders have led product, engineering, and go-to-market functions from seed to IPO and beyond at some of the most influential security companies of the last decade, including CrowdStrike, SentinelOne, Cohesity, Attivo Networks, Cylance, and McAfee. They have built platforms that protect the world’s largest enterprises, scaled organizations through hyper-growth, and navigated the security challenges that emerge each time a new technology paradigm takes hold.

JetStream’s seed round closed in a matter of weeks, reflecting strong demand from both investors and enterprises. The company is already working with F500 organizations, and plans to expand rapidly across engineering, product, and go-to-market teams.

doubleAI’s WarpSpeed Beats a Decade of Expert-Engineered GPU Kernels — Every Single One of Them

AI system achieves 3.6x average speedup over human experts across all tested algorithms and GPU architectures, marking the arrival of Artificial Expert Intelligence

doubleAI today announced WarpSpeed, the first Artificial Expert system to autonomously surpass world-class human experts in GPU performance engineering. WarpSpeed rewrote and re-optimized every kernel in NVIDIA’s cuGraph library — one of the most widely used GPU-accelerated graph analytics libraries in the world — delivering a 3.6x average speedup over a decade of expert-tuned code. The hyper-optimized library is now available on GitHub as a drop-in replacement requiring no code changes.

Key Results

  • 3.6x average speedup over human expert-written kernels
  • 100% of algorithms tested run faster with WarpSpeed
  • 55% of kernels achieve more than 2x improvement
  • Validated across three GPU architectures: NVIDIA A100, L4, and A10G

Why This Matters

cuGraph has been built and continuously refined by some of the world’s top GPU performance engineers over roughly a decade. It spans dozens of graph algorithms, each hand-optimized for maximum throughput. WarpSpeed beat every single one of them — on every tested GPU.

While AI has earned headlines for winning gold medals at the International Mathematical Olympiad and outperforming top programmers on competitive coding platforms like CodeForces, those achievements share three hidden advantages: abundant training data, easy-to-verify solutions, and short reasoning chains. GPU performance engineering breaks all three assumptions simultaneously:

  • Data scarcity: Only a few thousand truly optimized CUDA kernels exist publicly.
  • Validation complexity: Correctness is hard to verify — multiple valid solutions exist, and simple diffs are insufficient.
  • Deep reasoning chains: Optimal performance emerges from a long chain of interacting decisions — memory layout, warp behavior, caching strategy, scheduling, and graph structure.

Even state-of-the-art coding agents, including Claude Code, Codex, and Gemini CLI, fail dramatically in this domain — often producing incorrect implementations even when provided with cuGraph’s own test suite. In testing, leading coding agents produced buggy solutions in approximately 40% of tasks, making them unusable for real-world kernel replacement.

A New Paradigm: Artificial Expert Intelligence (AEI)

WarpSpeed represents the beginning of what doubleAI calls Artificial Expert Intelligence (AEI) — not Artificial General Intelligence (AGI), but something the world may need more urgently: AI systems that reliably surpass human experts in domains where expertise is rarest, slowest to develop, and most valuable.

“The real question isn’t ‘can AI code?’ — it’s ‘can AI become an expert?’” said Prof. Amnon Shashua, cofounder and CEO. “Humanity’s progress is bottlenecked by experts. If we can copy and paste expertise into the world, the impact is transformative.”

The Science Behind WarpSpeed

WarpSpeed’s results stem not from scaling alone, but from new algorithmic ideas developed by doubleAI’s research team:

  • Diligent Learning: A method for efficiently searching in the space of ideas, enabling AI to navigate vast design spaces and converge on high-quality solutions even when training data is scarce.
  • PAC Reasoning and Verification: A methodology for verifying correctness when ground truth is unavailable. Built on two components — Input Generation (IG), which finds challenging test inputs, and Automatic Validation (AV), which determines whether outputs are correct given a problem description. Grounded in the computational insight that verification is simpler than search, this approach allows AI to bootstrap reliable self-verification even in domains where it cannot yet solve the original problems.

These components create a flywheel: better verification engines produce better training data, which train stronger experts, which generate more sophisticated verification — and the cycle continues.
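The IG/AV split can be sketched as an ordinary property-based test loop. Everything below is a hypothetical illustration (a sorting routine standing in for a GPU kernel), not doubleAI’s system: Input Generation proposes hard cases, and Automatic Validation checks output properties derived from the problem description instead of diffing against a single reference answer.

```python
import random

# Hypothetical illustration of the IG/AV loop (not doubleAI's code):
# a sorting routine stands in for a GPU kernel under verification.

def input_generation(n_cases: int, seed: int = 0) -> list[list[int]]:
    """IG: propose test inputs, seeding in edge cases (empty, duplicates)."""
    rng = random.Random(seed)
    cases = [[], [7, 7, 7]]
    cases += [[rng.randint(-100, 100) for _ in range(rng.randint(1, 50))]
              for _ in range(n_cases)]
    return cases

def automatic_validation(candidate, xs: list[int]) -> bool:
    """AV: check problem-level properties, since many valid outputs may
    exist and a simple diff against one reference is insufficient."""
    out = candidate(list(xs))
    return out == sorted(out) and sorted(out) == sorted(xs)

def candidate_sort(xs: list[int]) -> list[int]:
    return sorted(xs)            # the "kernel" being verified

assert all(automatic_validation(candidate_sort, xs)
           for xs in input_generation(20))
```

Because the validator only needs to check properties (here: ordered output, same multiset), it stays cheap even when producing a correct solution is hard, which is the “verification is simpler than search” insight the release describes.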

Availability

WarpSpeed-optimized cuGraph kernels are available today on GitHub at https://github.com/double-ai/doubleGraph. Users can install the optimized library with no changes to their existing code.

Looking Ahead

GPU hardware has long outpaced the software that runs on it. Every new architecture ships faster silicon, but the kernels and algorithms underneath lag behind — bottlenecked by the scarcity of engineers who can fully exploit it. WarpSpeed closes that gap: AI that keeps software in lockstep with hardware, unlocking the full potential of modern GPUs and opening the door to use cases that were previously out of reach.

cuGraph is a stress test. If AEI works in a domain where data is scarce, validation is hard, and the baselines are elite, then AEI can work wherever expertise is the bottleneck — from drug discovery and chip design to cybersecurity, robotics, and climate technology.

Support for the Agent2Agent protocol helps connect AI agents anywhere in real time so they can collaborate at enterprise scale

Multivariate Anomaly Detection takes anomaly detection to the next level, stopping problems before they start

Confluent, Inc. (Nasdaq: CFLT), the data streaming pioneer, today announced new Confluent Intelligence capabilities that connect artificial intelligence (AI) agents and enable more accurate, intelligent data analysis. Confluent’s Streaming Agents use the Agent2Agent (A2A) protocol to trigger and coordinate external AI agents using real-time data streams, making it easier to connect AI systems across an enterprise. Multivariate Anomaly Detection looks at multiple metrics to automatically spot unusual patterns in data streams, helping teams prevent issues with greater accuracy—before they cause outages or downstream impacts. Together, these capabilities create intelligent context-aware AI systems that adapt as data, agents, and business conditions change.

“If you want to be competitive, your AI can’t be looking in the rearview mirror,” said Sean Falconer, Head of AI at Confluent. “You need a system of AI agents that work together and constantly learn and share insights in real time. Confluent Intelligence connects teams’ AI investments and systems no matter where they’re built—so AI can automatically react to live data, take action, coordinate systems, and escalate to team members as needed.”

Build Collaborative Agent Ecosystems

Businesses are increasingly turning to AI agents to automate decisions and handle more complex work. According to the IDC FutureScape: Worldwide Future of Work 2026 Predictions, “By 2026, 40% of all G2000 job roles will involve working with AI agents, redefining long-held traditional entry-, mid-, and senior-level positions.” And even that’s likely a conservative estimate. But as agents spread across tools and systems, most operate in isolation. If agents can’t communicate with each other or share context across a business, insights get trapped in silos and decisions are fragmented.

Confluent’s Streaming Agents address this by connecting AI agents to real-time data through Anthropic’s Model Context Protocol (MCP) and to other agents through the A2A protocol. Together, they can continuously analyze information from agent frameworks such as LangChain and data platforms like BigQuery, Databricks, and Snowflake to generate insights, then trigger workflows in enterprise AI platforms like Salesforce and ServiceNow to take immediate action, closing the gap between insight and execution. By connecting these systems, Confluent turns stream-level analysis into “insight to action,” generating the real-time intelligence needed to adapt quickly as business needs change.

With A2A support in Streaming Agents, teams can:

  • Build smarter, reusable AI agents: Feed existing agents and systems with fresh context from Confluent to asynchronously respond to events and take further actions.
  • Unlock inter-agent communication and auditability: Capture every agent action in an immutable log for auditability and replayability. Leverage Apache Kafka® to orchestrate communication between agents and to reuse agent outputs across other agents and systems.
  • Centralize orchestration and governance in one place: Streaming Agents acts as the orchestrator, and Confluent ensures governance, security, and end-to-end observability for all agent interactions.

Teams in all industries can use A2A support in Streaming Agents to drive higher revenue, lower risk, and save on costs. Streaming Agents can personalize offers in retail, reduce credit risk in underwriting for financial services, automate care recommendations in healthcare, predict maintenance needs in manufacturing, and proactively remediate outages in telecommunications.

A2A support in Streaming Agents is now available in Open Preview.

Act on Live Signals and Eliminate Blind Spots

Businesses generate more data than ever, yet they struggle to understand what’s important and what can be ignored. Anomaly detection surfaces threats and opportunities that no human could spot on their own. Traditional anomaly detection often analyzes metrics in isolation and is frequently restricted to batch-based analysis on historical data. Relying on simple statistical baselines, these systems are highly sensitive to noise, spikes, and bad data. Without context, they can generate false positives, and they typically surface issues after they’ve already impacted the system.

Confluent’s Multivariate Anomaly Detection, a new feature of the built-in Machine Learning (ML) Functions, analyzes related metrics together to reduce false positives and catch real issues faster. It allows teams to detect anomalies across multiple metrics while ignoring data outliers, ensuring higher accuracy for complex data monitoring. Teams can start using Multivariate Anomaly Detection immediately since they don’t need to build or update the model, which learns as data changes.

In addition, teams can:

  • Understand a system’s healthy state: Traditional anomaly detection tools rely on averages, which can be thrown off by a single random spike in the data. Confluent’s Multivariate Anomaly Detection uses ML that learns continuously from teams’ real-time data, ignoring one-off glitches and building a better understanding of the system.
  • Recognize complex problems and patterns: Confluent’s Multivariate Anomaly Detection analyzes multiple metrics together as a unified group, such as looking at CPU, memory, and latency combined, instead of just one at a time, to find patterns. Now, teams can uncover complex issues that would otherwise be missed if they looked only at individual metrics.
  • Act automatically: The model constantly measures how far new data points are from the “true normal” and instantly flags data that drifts too far as an anomaly.
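
One standard way to score “how far from the true normal” a point sits across related metrics is the Mahalanobis distance, which accounts for how the metrics move together. The sketch below illustrates that general idea, not Confluent’s implementation; the metric names, baseline data, and threshold are invented for the example:

```python
from math import sqrt

def mahalanobis_2d(baseline, point):
    """Score how far (x, y) sits from the baseline's joint 'normal'."""
    xs = [p[0] for p in baseline]
    ys = [p[1] for p in baseline]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    n = len(baseline) - 1  # sample covariance
    sxx = sum((x - mx) ** 2 for x in xs) / n
    syy = sum((y - my) ** 2 for y in ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in baseline) / n
    det = sxx * syy - sxy ** 2  # determinant of the 2x2 covariance matrix
    dx, dy = point[0] - mx, point[1] - my
    # v^T * inverse(covariance) * v, written out for the 2x2 case
    d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return sqrt(d2)

# CPU utilization vs. p99 latency (ms): normally they rise and fall together.
baseline = [(0.2, 20), (0.3, 31), (0.4, 39), (0.5, 52),
            (0.6, 60), (0.7, 69), (0.8, 81)]

in_pattern = mahalanobis_2d(baseline, (0.55, 55))  # follows the correlation
broken = mahalanobis_2d(baseline, (0.8, 20))       # each metric fine on its own

THRESHOLD = 3.0  # illustrative cutoff
print(in_pattern < THRESHOLD, broken > THRESHOLD)  # → True True
```

Note that the flagged point is within each metric’s individual range (high CPU, low latency); only looking at the metrics jointly reveals the broken correlation, which is exactly the class of issue univariate checks miss.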

Axelera AI Secures More Than $250 Million in Funding for Global Commercial Growth

European AI semiconductor leader earns backing from funds managed by Innovation Industries and SiteGround Capital along with EU institutions; edge-first architecture addresses AI’s critical energy and cooling constraints

Axelera AI, the European leader in AI acceleration hardware, announced its latest funding round led by Innovation Industries, with participation from prominent funds and accounts including BlackRock and SiteGround Capital as new investors, as well as existing investors Bitfury, CDP Venture Capital, European Investment Council Fund, Federal Holding and Investment Company of Belgium (SFPIM), Invest-NL, Samsung Catalyst Fund, and Verve Investments. Axelera AI has attracted over $450 million in equity, grants and venture debt since incorporating in July 2021.

The largest investment ever in an EU AI semiconductor company comes as Axelera ships to its 500th global customer across physical AI and edge AI in industries including defense and public safety, industrial manufacturing, retail, agritech, robotics, and security, firmly establishing the company as the global leader in power-efficient AI inference solutions.

Axelera AI’s success is rooted in a fundamental insight: to deploy AI at scale, the industry must first solve for energy consumption and cooling requirements. The company’s edge-first architectural approach delivers uncompromising AI inference performance that fits within the power and thermal envelopes of real-world deployment environments to drive real business value. And by providing high performance at the edge, companies can process data locally, which preserves privacy for their users and supports the increasing demand for sovereign solutions.

“Data centers are hitting power and cooling limits, and as analytics move closer to where data is being created, edge AI solutions must operate within strict energy and bandwidth constraints,” said Fabrizio Del Maffeo, CEO and co-founder of Axelera AI. “We designed our architecture from the ground up to overcome these obstacles. Our edge-first approach isn’t just about efficiency; it’s about making AI deployment economically viable at scale for real-world applications while protecting data and privacy by processing customer information locally.”

Inference is projected to be a more than $250B market by 2030[i]. Over the life of a model, inference costs 15x more than training, and utilization is growing 31x per year[ii]. Yet many organizations still struggle to move from AI projects to generating value in production. Axelera’s tightly co-designed hardware/software solution simplifies deployment and maximizes the performance of inference-based workloads. Axelera AI’s Europa and Metis platforms deliver the price/performance balance enterprise customers need while operating within the energy and thermal constraints of edge computing.

Rising Above Market Fragmentation

The edge AI semiconductor market has attracted over $60 billion in venture funding in just the past three years[iii], creating significant fragmentation and confusion for customers. Axelera AI’s strong financial foundation, proven technology, customer traction, scaled manufacturing through partnerships with TSMC and Samsung, and growing ecosystem of software and integration partners position the company for long-term growth and success.

“Axelera is solving one of the most fundamental constraints in Edge AI adoption: the cost and energy efficiency of inference at scale,” said Rogier Ketelaars, investment manager at Innovation Industries. “We believe the company is uniquely positioned to become a foundational player in the next generation of AI infrastructure, and we’re excited to back the outstanding Axelera team that combines deep technical leadership and real commercial execution.”

The funding round represents strong institutional validation of this architectural philosophy, with BlackRock’s participation underscoring the financial community’s recognition that solving AI’s infrastructure constraints is critical to the technology’s continued growth and market expansion.

Unique Ecosystem Approach Drives Market Accessibility

Recognizing that hardware alone doesn’t drive adoption, Axelera AI has intentionally built an ecosystem that makes AI acceleration accessible to a broader market. The company’s Partner Accelerator Network, launched last year, represents a differentiated go-to-market approach that brings together software vendors, model makers, system integrators, solution providers, and technology partners to accelerate customer deployment and reduce time-to-production.

In addition, Axelera AI’s significant investment in software and deep commitment to usability enables AI developers to easily integrate Axelera AI’s acceleration into existing workflows without extensive redesign. By solving for both the technical constraints of energy and cooling, and the practical constraints of cost, ecosystem accessibility and software usability, Axelera AI removes the barriers to AI deployment at scale.

The added capital will accelerate Axelera AI’s manufacturing scale, expand its customer success organization and Partner Accelerator Network, and continue advancing the software tools and SDK that have made the company’s platforms accessible to AI developers worldwide.

[i] AI Inference Market worth $254.98 billion by 2030 – Exclusive Report by MarketsandMarkets™

[ii] Training vs. Inference: The $300B AI Shift Everyone is Missing

[iii] AI Chip Market to Grow 10x in the Next Ten Years and Become a $300 Billion Industry – Edge AI and Vision Alliance

Red Hat AI Factory with NVIDIA Accelerates the Path to Scalable Production AI

New co-engineered offering combines Red Hat AI Enterprise and NVIDIA’s accelerated computing software to provide a unified foundation for building, deploying, and scaling AI-enabled applications

Red Hat, a leading provider of open source solutions, today announced the Red Hat AI Factory with NVIDIA, a co-engineered software platform that combines Red Hat AI Enterprise and NVIDIA AI Enterprise to provide an end-to-end AI solution optimized for organizations deploying AI at scale. Red Hat AI Factory with NVIDIA is the latest milestone in the companies’ deep collaboration, accelerating the delivery of the newest AI innovations to enterprise customers today while also delivering Day 0 support for NVIDIA hardware architectures.  

With enterprise AI spending expected to reach over $1 trillion by 2029, driven in large part by agentic AI applications, organizations are looking to shift their strategies toward high-density, agentic workflows and address the resulting demands on AI inference and infrastructure. To help organizations keep pace, Red Hat AI Factory with NVIDIA empowers IT operations teams to streamline management of both traditional infrastructure and the evolving demands of the AI stack. 

“The shift from AI experimentation to industrial-scale, enterprise-wide production requires a fundamental change in how we manage the AI computing stack,” said Chris Wright, chief technology officer and senior vice president, Global Engineering, Red Hat. “We’re accelerating the path to deploy AI and move quickly to production using Red Hat AI Factory with NVIDIA. With a stable, high-performance foundation driven by our proven hybrid cloud offerings, we’re enabling our customers to own their AI strategy and scale with the same rigor they apply to their core IT platforms.”

Red Hat AI Factory with NVIDIA accelerates the path to production AI, from provisioning the underlying infrastructure to fueling higher performance for the models and GPUs driving the inference stack. This empowers IT administrators and operations teams to scale and maintain AI deployments with the same operational rigor and predictability as any enterprise workload.

This co-engineered software platform integrates the open source collaboration, engineering and support expertise of both Red Hat and NVIDIA to deliver a trusted, enterprise-grade solution. The Red Hat AI Factory with NVIDIA provides a highly scalable foundation for AI deployments across any environment, whether on-premises, in the cloud or at the edge. It includes core capabilities for high-performance AI inference, model tuning, customization and agent deployment and management, with a focus on security. This allows organizations to maintain architectural control from the datacenter to the public cloud, delivering:

  • Accelerated time-to-value: Advance to production AI with streamlined workflows and instant access to pre-configured models, including the indemnified IBM Granite family, NVIDIA Nemotron, and NVIDIA Cosmos open models, delivered as NVIDIA NIM microservices. Additionally, organizations can further align models to enterprise data using NVIDIA NeMo, reducing tuning time and cost. 
  • Optimized performance and cost: Maximize infrastructure usage and bolster inference performance with a unified, high-performance serving stack. Red Hat AI Factory with NVIDIA delivers built-in observability capabilities and taps Red Hat AI inference capabilities powered by vLLM, NVIDIA TensorRT-LLM, NVIDIA Dynamo, and NVIDIA BlueField to meet strict AI service level objectives. This helps organizations reduce the total cost of ownership (TCO) for AI by optimizing the connection between models and NVIDIA GPUs.
  • Strengthened enterprise posture: Leveraging the flexible and stable foundation of Red Hat Enterprise Linux, organizations benefit from advanced security and compliance capabilities built-in from the start that help to lower risk, save time and mitigate downtime. This delivers a security-hardened foundation for mission-critical AI workloads that require isolation and continuous verification.

“The next era of enterprise AI is about real-time action and tangible business return, and that requires an industrial-strength, hybrid foundation,” said Vlad Rozanovich, senior vice president, Infrastructure Solutions Group, Lenovo. “We can bring a scalable, enterprise-grade platform that combines Lenovo’s inferencing-optimized infrastructure with offerings like Red Hat AI Enterprise and the Red Hat AI Factory with NVIDIA, to give customers the real-time advantage – a resilient foundation for agentic AI that is deployable and manageable anywhere they operate.”

AI Industry Predictions for 2026

Welcome to the Radical Data Science (RDS) annual technology predictions round-up! The AI industry has significant inertia moving into 2026. In order to give our valued global audience a pulse on important new trends leading into next year, we here at RDS heard from many of our friends across the vendor ecosystem to get their insights, reflections and predictions for what may be coming. We were very encouraged to hear such exciting perspectives. Even if only half actually come true, AI in the next year is destined to be quite an exciting ride. Enjoy!

[NOTE: The last updates have been made! The commentaries below are in no particular order.]

Daniel D. Gutierrez – Principal AI Industry Analyst, Influencer & Resident Data Scientist

AI and automation will lead to more business disruptions; organizations will have to tackle this challenge head-on — As AI-driven response becomes embedded in security operations center (SOC) workflows, organizations can experience a new class of self-inflicted outages. AI systems will confidently take “correct” actions without grasping business context, such as locking out key authentication pathways or shutting down critical operations to contain perceived anomalies. Tolerance for over-reliance on AI, and for accepting such inconsistencies, will shrink. In 2026, companies will need to stop accepting “the AI did it” as an excuse and formalize human-in-the-loop governance to prevent AI-triggered business downtime. – Steve Holmes, Gurucul

Prevention is dead in cybersecurity – By 2026, the myth of prevention as a primary strategy will be fully exposed. Attackers are faster, smarter, and more patient than ever, leveraging AI, deepfakes, and malware that can remain undetected for months, bypassing traditional defenses. Many vendors will continue to overemphasize prevention, presenting it as innovation while moving away from detection and response, but this approach is increasingly ineffective. Breach rates are rising 17 percent year over year, with 55 percent of organizations affected in the past 12 months alone, highlighting that relying solely on perimeter defenses is no longer sufficient.

This acceleration makes real-time detection, removal, and complete visibility critical. Organizations that implement continuous risk assessment, monitor third-party ecosystems, and maintain visibility into encrypted traffic where most threats now hide will gain a decisive advantage. Aligning AI initiatives with security priorities further ensures defenses keep pace with adversaries. In this landscape, resilience is not about keeping every threat out; it is about seeing, stopping, and learning from threats in real time. Prevention alone is a pipe dream; the companies that survive and thrive will be those that detect and remove threats before damage is done. – Shane Buckley, president and CEO at Gigamon

As companies move LLMs from pilot projects to fully deployed production systems, they will need AI Observability tools that can address not just model-related performance issues (like model drift, data quality, and hallucination identification and prevention) but can also manage the health and performance of the underlying infrastructure. Rapid, sub-minute identification of anomalies and root-cause analysis across both models and the underlying infrastructure will be essential to an organization’s ability to capitalize on the promise of enterprise-scale AI. – Helen Gu, founder of InsightFinder AI

2026 Kubernetes predictions – As AI workloads shift from training to massive-scale inference, SRE (Site Reliability Engineering) teams are about to feel even more pressure. GPU-heavy computing is breaking the assumptions today’s clusters were built on, while enterprises are beginning to trust autonomous operations and cost pressure is pushing consolidation across the cloud-infrastructure stack. Based on these forces, here are my 2026 Kubernetes predictions as well as some best practice recommendations to help platform teams prepare for what reliable operations will mean next year. – Itiel Shwartz, CTO and cofounder of Komodor

  • As AI/ML use continues to increase, more workloads will move from training to inference. Even the new GKE experiments are showing signs of this, as the huge number of nodes they scale up contains a significant share of inference workloads.
  • AI SRE will make a significant adoption impact. As more organizations deploy cloud-native infrastructure and GenAI cuts time to market for their competitors, platform teams will understand that to continue to innovate and lead, they need to scale up their SRE teams. With Kubernetes experts at a premium, AI SRE will prove to be the missing ingredient that allows them to adapt.
  • Cloud operations will start to move toward autonomy. As more AI-powered tooling is adopted and users come to trust it, we will see traditionally conservative enterprises begin to allow some operations to be autonomously managed by AI.
  • Cloud-native job queueing systems, like Kueue, will see a major uptick in adoption as the race to deploy HPC, AI/ML, and even quantum applications heats up. Since previous queue systems were not built for this scale, new tooling will quickly be adopted across the industry.
  • With applications and workloads relying on more compute than ever before, Kubernetes scheduling will require a makeover. The current pod-centric approach will not be able to handle this increased scale, so a more workload-specific approach for the scheduler will be required. The community is actively working on this through KEP-4671: Gang Scheduling, which will be managed natively in K8s.
  • GPU overprovisioning will become a more pressing problem. As the macroeconomic climate continues to push toward greater efficiency, organizations will have to find ways to optimize their GPU monitoring and usage.
  • FinOps tools will start to consolidate with other products in the cloud infrastructure stack. Similar to what is happening in cloud security, products will consolidate different capabilities, including observability, insights, tracing, cost optimization and troubleshooting, into a single platform. This will remove cognitive load from teams struggling to keep up with too many dashboards and products.

The Rise of Decoupled Observability Stacks – In 2026, the era of the all-in-one observability black box will be over. AI is driving massive growth in logs, metrics, and traces, pushing tightly coupled observability platforms past their limits. Organizations are reaching a breaking point: they can no longer scale these monolithic systems without sacrificing data visibility or having to absorb runaway costs.

The cost and complexity of scaling current observability stacks will become unsustainable. Forward-thinking teams are already starting to rethink architecture, pulling apart the data layer from the tools that sit on top of it. We’ve seen this movie play out before – business intelligence went through the same evolution over the last 40 years. It started as tightly coupled stacks in the 80s and exists today as a decoupled architecture that gives teams flexibility, choice, and control. The separation gave rise to the Snowflakes, Databricks, Fivetrans, and Tableaus of the world. Observability is next.

The observability warehouse (i.e., specialized data stores for logs, metrics and traces) will emerge as the new standard, serving as a central data layer that reduces dependence on any one monolithic platform, freeing teams from vendor lock-in and letting them choose the best tools for the job. – Eric Tschetter, Chief Architect at Imply

Investment may increasingly shift toward fueling new human insight and the systems that preserve it. While most industries face data exhaustion, health tech remains uniquely advantaged, sitting on a vast, still-untapped trove of medical records, clinical notes, and real-world evidence. As AI models in other domains begin to plateau from recycled training data, healthcare has the rare opportunity to keep improving by responsibly unlocking, structuring, and digitizing the knowledge already within its walls. – Melvin Lai, Senior Venture Associate, Silicon Foundry

AI agents will move from non-production to production

Today, most AI agents still run outside production environments. That will change as organizations connect agents directly to live systems and workflows. At scale, this shift will force enterprises to actively manage agent permissions, lifecycle controls, and accountability. – Itamar Apelblat, CEO & Co-Founder, Token Security

The Data Backbone Becomes the Compliance Frontline — In 2026, organizations will find that compliance is no longer just about ticking boxes; it will hinge on the quality and structure of their data. Clean, unified data will be the foundation for leveraging AI to detect risks, enforce policies, and anticipate regulatory scrutiny. Companies that treat data architecture as a strategic asset will not only simplify compliance but also gain a competitive edge in using AI to understand complex operational environments. – Chase Doelling, Principal Strategist & Director at JumpCloud

Hybrid AI Architecture Becomes the Enterprise Default — Enterprises will lead the way by deploying Small Language Models (SLMs) combined with Retrieval-Augmented Generation (RAG) on purpose-built AI hardware at the edge to offset the rising cost and latency of frontier LLMs. There will be an architectural shift toward hybrid AI data pipelines. Frontier models will be used for reasoning and edge SLMs for contextualization, redefining cost-per-insight as the new efficiency metric for AI operations. Enterprises will look to consolidate communication and collaboration tools to a smaller number of vendors. Users will want one application for chat, calling, SMS/MMS messaging, and meetings, for both internal collaboration and external communications. – Doug Ford, VP, Solutions Portfolio & Technical Readiness, All Covered

AI Agents Will Become Self-Building — In 2025, agentic AI went mainstream, but most solutions still required significant engineering lift: specialized teams, custom code, and months of hand-holding just to get the first workflow running. Next-generation AI agents will be created through low-code and no-code platforms designed for business users, not technical experts. Instead of relying on engineering teams, organizations will build, deploy, and iterate agents themselves. As companies gain the ability to create and maintain agents internally, deployments will become faster, cheaper, and more aligned with real operational needs. – Aquant co-founder, Assaf Melochna

2026 will be the year of AI’s security refresh — In 2025, we saw organizations take a shotgun approach to AI implementation. Organizations that joined the AI race invested heavily in AI technology in hopes of increasing productivity and ROI for the business. This rapid implementation of AI has introduced more vulnerabilities to the market. For example, we saw the emergence of prompt injection attacks, where threat actors manipulate an LLM to bypass instructions and force unintended actions. With researchers discovering security vulnerabilities in AI code and implementations such as Google Gemini and Microsoft Copilot, a new category of threats has begun to emerge. Rushing AI implementation without the proper safeguards in place sets a dangerous precedent and is creating security gaps that can lead to a downstream effect impacting organizations of all sizes.

In 2026, organizations will double down on fixing and re-developing poorly implemented AI with specialized controls, and reinforce proven security frameworks to address vulnerabilities introduced by AI tools that were rushed to market. MSPs and small and mid-market IT teams will treat AI oversight like code review to catch risks before they hit production and protect clients’ trust. – Amanda Berlin, Senior Product Manager of Cybersecurity at Blumira

Why vector search should still be in your toolbag — While vector search isn’t nearly as comprehensive as GraphRAG, it still performs very well for simple retrieval tasks. We’ll see GraphRAG, and possibly other advanced techniques, used to synthesize data across complex organizational systems, providing LLMs with structured context and helping reduce hallucinations.

However, for many problems, vector search remains a perfectly sensible option and should always be considered. In other words, for simpler use cases, vector-only approaches are entirely adequate; they can get you in the ballpark and deliver good results. I expect engineers will continue testing vector search on information retrieval cases where they already know the answers, as a way to evaluate their LLMs. – Memgraph CEO, Dominik Tomicevic
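
For reference, the core of vector-only retrieval is simply a nearest-neighbor ranking over embeddings. This toy Python sketch (document names and 3-d vectors are fabricated; real systems use model-generated embeddings with hundreds or thousands of dimensions and an approximate-nearest-neighbor index) shows the basic cosine-similarity ranking:

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def search(index, query_vec, k=2):
    """Rank documents by cosine similarity to the query vector; return top-k ids."""
    ranked = sorted(index.items(),
                    key=lambda kv: cosine(kv[1], query_vec),
                    reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]

# Toy 3-d "embeddings" for three documents (fabricated for illustration).
index = {
    "invoice-faq": [0.9, 0.1, 0.0],
    "onboarding-guide": [0.1, 0.9, 0.1],
    "billing-policy": [0.8, 0.2, 0.1],
}
print(search(index, [1.0, 0.0, 0.0]))  # billing-related docs rank first
```

Known question–answer pairs can then serve as the quick evaluation described above: if the expected document isn’t in the top-k for its query, the embedding or chunking strategy needs work before blaming the LLM.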

2026 will be the year of Bring Your Own AI — Enterprises will stop looking for one-size-fits-all models and start layering AI into their architecture with flexibility. That shift will require orchestration platforms that can route tasks across GPT, Claude, Gemini, or open-source tools depending on use case, compliance needs, and performance. – Bryan Cheung, CMO of Liferay

CAPEX IT becomes a dinosaur — Next year, organizations clinging to CAPEX-heavy infrastructure models will find themselves in a major jam. As AI and economic volatility create a need for speed, agility, and cost predictability, rigid, asset-heavy IT strategies will collapse. The winners will be those who shift decisively to OPEX-based models, consuming infrastructure like a utility, scaling instantly, and paying only for what they use. Innovation can’t wait for a three-year hardware refresh cycle. Companies that fail to pivot to OPEX approaches will be outmaneuvered by competitors who have adapted for change. – Vadim Vladimirskiy, CEO and co-founder, Nerdio

Conversational Commerce Becomes the Default Front Door — AI-powered shopping will explode as consumers turn to conversational agents as the first stop for brand discovery and comparison. The next wave will see ‘shopping agents’ handling recommendations, price checks, and even purchasing, completely redefining how people engage with brands online. – Sarah Molloy, Director of Strategic Partnerships, Brij

For years, cybersecurity has focused on locking down devices. We’ve wrapped them in management software, antivirus, and access controls, all in an attempt to contain data that should never have been there in the first place.

Every year, enterprises pay more and more for corporate devices under the assumption they’re keeping corporate data separate from users’ personal devices. However, with apps like Outlook, Excel, and Google Sheets all accessible on mobile devices, a breach of a personal device is a breach of enterprise data.

Moving into next year, AI-generated exploits will continue to be created and deployed in minutes. Our mobile-device-driven society has extended the attack surface beyond the control of IT and cybersecurity staff. We must concentrate on reducing the attack surface and protecting our proprietary, sensitive, and personal data with the same level of care. – CSO of Hypori, Matt Stern

AI Oversharing Trend — “AI oversharing” is an emerging phenomenon in which enterprise AI applications expose sensitive information not through attacks or breaches, but through poorly defined access controls. It is prevalent in popular Retrieval-Augmented Generation (RAG) architectures when the proper roles and permissions of the original data sources aren’t enforced, and it continues to be one of the most significant yet underreported data privacy risks organizations face today. – Oliver Friedrichs, CEO and Co-Founder, Pangea

AI may reduce operational inefficiencies, helping to accelerate growth for private practice owners — By eliminating operational inefficiencies—from documentation and scheduling workflows to claims and billing processes—AI technology can enable practices to scale their efforts, increase patient volumes, and grow the practice. With fewer errors, claims may be approved faster and reimbursements distributed in much shorter time. No show rates can drop, and patient billing may become more streamlined. With AI technology integrated into daily operations, private practices may be better positioned to remain financially resilient and thrive. – Nupura Kolwalkar-Rana, AdvancedMD’s Chief Product & Technology Officer

Impact on the energy sector — Rapid AI adoption is creating unprecedented energy demands that current power grids are not engineered to handle. The strain is colliding with the rising threat of sophisticated cyberattacks targeting critical infrastructure like power grids and pipelines, creating a new class of compounding risk. Service disruptions may become a normalized challenge, forcing many organizations to rethink operational resilience. – Adam Khan, VP, Global Security Operations at Barracuda  

Hackers will breach an AI application in 2026 — and then they will manipulate the AI application to cause problems in the target company. Organizations will need to start treating an AI application like a person, much in the same way as we did for bots not too long ago. – Bruce Esposito, Senior Manager of IGA Strategy and Product Marketing at One Identity

A major Copilot-driven breach exposes the risks of AI over-permissioning: 2026 will see a headline-grabbing incident where Microsoft Copilot accesses sensitive data or executes privileged actions beyond its intended scope. As organizations rush to deploy AI copilots across productivity, code, and cloud environments, many will grant broad permissions “to keep things working.” This over-permissioning, combined with implicit trust in AI automation, will lead to unauthorized data exposure or lateral movement. The incident will force enterprises to adopt granular permission controls, audit trails, and continuous monitoring for AI assistants — treating them as powerful identities, not productivity add-ons. – Rob Rachwald, Vice President, Veza

Data Mesh Architecture: Decentralization of data ownership will become more prevalent, allowing teams to manage their own data as products. This will be particularly beneficial for large organizations seeking independent, high-quality data exchange. – Arnab Sen, VP of Data Engineering at Tredence

Ditching the cloud, moving data back to data centers — In 2026, enterprises will begin migrating select workloads and sensitive data from the public cloud back into their own data centers. The “trillion-dollar paradox,” as Andreessen Horowitz described it, is forcing business leaders to face a hard truth: the cloud’s convenience often hides long-term cost and control tradeoffs. The agility that once justified the cloud premium has become a drag on profitability. We will see more organizations move back to the data center for fear that data entered into the cloud will be consumed by public LLMs. A number of organizations already run private LLMs to do their AI work on-premises.

Customers want tighter control over sensitive data and less exposure to cloud outages or the risk that public large language models will ingest proprietary information. The next phase of cloud adoption will look more balanced. Companies will keep what makes sense in the cloud and bring home the workloads that do not. Many will take a hard look at what they are paying for and what they gain in return, then move critical systems back into environments they can fully control. This shift will create more hybrid models that help organizations cut waste, tighten security, and make more informed decisions about where to store their most sensitive data based on cost, performance, and regulatory needs. – John Kindervag, Chief Evangelist at Illumio

Companies will hit an AI operations wall as projects scale from pilots to dozens of implementations — Technology and security leaders will face an AI operational bottleneck, struggling to scale from isolated pilots to enterprise-wide implementations. Industries that rely on complex data ecosystems like finance, manufacturing and healthcare will be particularly vulnerable to conflicting data pipelines, inconsistent architectures and uneven security practices. Without AIOps frameworks and strong governance structures, organizations risk losing visibility, control of their tech stacks and long-term operational resilience. – Siroui Mushegian, CIO at Barracuda 

While it is true that the security risks from AI models continue to grow, both because of their capabilities and because of stepped-up attacks against the guardrails these models have in place, it’s also important not to overhype the threats. We’ve already seen a couple of reports this year that exaggerated the threats that AI models currently pose. While we have reported an uptick in the interest and capabilities of both nation-state and cybercriminal threat actors when it comes to AI usage, these threats do not exceed the ability of organizations following best security practices. That may change in the future; at this moment, however, it is more important than ever to understand the difference between “hype” and reality when it comes to AI and other threats. – Allan Liska, threat intelligence analyst, Recorded Future.

The New Age of Deception: The Threat of AI Identity — In 2026, identity becomes the main target. Flawless, real-time AI deepfakes (like “CEO doppelgängers”) will make it impossible to tell a fake from a real person. This risk is huge because autonomous agents outnumber humans by an 82:1 ratio. We face a trust crisis where one forged command can start an automated disaster. Identity security must change from just blocking attacks to actively enabling the business by securing every human, machine, and AI agent. – Palo Alto Networks

AI adoption in 2026 will feel familiar — Most enterprises will continue using agentic AI to automate repeatable tasks and augment existing processes, not reinvent them. Only one in five organizations currently report getting meaningful value from their AI tools, with cost and a lack of control mechanisms over desired outcomes as the key adoption challenges. Autonomous business intelligence will remain niche because the foundations, including the required infrastructure, are simply not ready: data quality, governance maturity, and organizational skills still lag far behind the ambition.

Modernization efforts will remain the primary focus. Companies will keep working through the practical realities of replacing platforms like VMware and Citrix, while using SaaS to accelerate outcomes where it makes sense. At the same time, compliance and regulatory pressure will intensify. Leaders will need a clear understanding of sovereignty requirements, new operating models, and the talent divide between "old way" and "new way" practitioners.

In 2026, CIOs will be planning for what IT must look like in 2030. The problems they solve today will not be the ones they face next, and there is a lot of pressure on the IT suite to ensure companies are ready and competitive as the AI transformation gains momentum. – Niels van Ingen, SVP Business Development and Strategy, Keepit

Foundation model providers compete fiercely for token share — Major model providers have amassed massive funding rounds and achieved lofty valuations. Demand for their services has so far grown exponentially, but 2026 will see supply catch up, and providers will begin competing for share rather than growing freely into greenfields as they struggle to meet committed targets.

A token is the basic unit of AI processing: models consume input tokens and generate output tokens, with output tokens typically priced at roughly 4x the rate of input tokens. A text token can be as short as a single character or as long as a word, depending on the language model and how it was trained. Measuring token usage instead of dollar revenue provides a clearer picture of AI demand.
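The input/output pricing asymmetry above can be made concrete with a quick back-of-the-envelope calculation. This is a minimal sketch: the per-token prices are hypothetical placeholders, not any provider's actual rates; only the ~4x output-to-input ratio comes from the text.

```python
# Estimate API cost for a request, assuming output tokens cost ~4x input tokens.
# Prices are hypothetical placeholders, in dollars per million tokens.
INPUT_PRICE_PER_M = 2.50                     # $ per 1M input tokens (assumed)
OUTPUT_PRICE_PER_M = 4 * INPUT_PRICE_PER_M   # ~4x input, per the rule of thumb above

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request given its token counts."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# A long prompt with a short answer is dominated by input cost;
# a short prompt with a long answer is dominated by output cost.
print(request_cost(100_000, 1_000))   # long prompt, short answer
print(request_cost(1_000, 100_000))   # short prompt, long answer
```

The asymmetry explains why summarization-style workloads (large input, small output) can be far cheaper per request than generation-heavy workloads of the same total token count.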

As open models increase in capability, the foundation model providers are locked in a struggle to differentiate their offerings while simultaneously making it easier to build AI into everything and fending off competitors seeking the same work. We are already seeing them invest across the board in consumer applications, enterprise platforms, development tools, partnerships, marketing, commitment-based pricing, and more.

One area that remains wide open is non-text modalities like audio and video, especially when they are needed in real time, such as for conversational interaction. This is a massive opportunity that, so far, is owned by OpenAI. Unlike text, these modalities replace human time second-for-second, making ROI instantly clear when they work: you can skim a 10-page document in a moment, but you have to talk to a voice representative in real time. Providers will also compete by making sure their tokens are used in more applications. OpenAI has simplified the process of integrating its platform into many software "surfaces," providing more of the plumbing needed to deliver its tokens so that adoption will rise. Microsoft is doing this as well, as are a number of third parties. – Mike Finley, Co-Founder, StellarIQ

Cost breaks down as a barrier to building great AI models — Until recently, it took enormous sums to build high-performance AI models, and hefty budgets to run inference on them. But DeepSeek R1 turned that paradigm upside down when it matched the performance of top-tier models at a fraction of both training and inference cost. In November, Kimi K2 Thinking topped the benchmarks on Humanity’s Last Exam and BrowserComp, with a reported training cost of only $4.5 million (ChatGPT training runs are estimated at $500 million) and an inference cost of $2.50 per million output tokens, one sixth the price of Claude Sonnet 4.5. We’ll see this trend intensify in 2026 as more low-cost, high-performance models take off, many of which have open weights.

This completely changes the economics of who gets to play in this space. Suddenly, you don’t need to be a mega-corporation with unlimited capital to build capable AI systems. That shift matters more than people realize because it opens the door for specialized solutions that solve specific, real-world problems instead of trying to be everything to everyone. As more players enter the market with tailored, cost-effective solutions, the competition will push innovation further, enabling industries that were previously out of reach to access powerful AI technology. Open-weight, state-of-the-art models can be run in private clouds, providing the security that CISOs demand in highly regulated industries. This democratization of AI will accelerate the pace at which businesses can address unique challenges and create impactful solutions. – Shanti Greene, Head of Data Science and AI Innovation, AnswerRocket
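The "one sixth the price" comparison quoted above translates directly into inference budgets. The sketch below uses only the two price points from the text ($2.50 per million output tokens versus a model priced at six times that rate); the monthly volume is a hypothetical illustration, not a reported figure.

```python
# Compare monthly inference spend at the two price points quoted above:
# $2.50 per million output tokens, versus a model priced at 6x that rate.
LOW_COST_PRICE_PER_M = 2.50
SIXFOLD_PRICE_PER_M = 6 * LOW_COST_PRICE_PER_M  # $15.00 per million tokens

def monthly_spend(output_tokens_per_month: int, price_per_m: float) -> float:
    """Dollar cost of generating the given number of output tokens."""
    return output_tokens_per_month * price_per_m / 1_000_000

# Hypothetical app generating 1 billion output tokens per month:
tokens = 1_000_000_000
print(monthly_spend(tokens, LOW_COST_PRICE_PER_M))  # low-cost model
print(monthly_spend(tokens, SIXFOLD_PRICE_PER_M))   # 6x-priced model
```

At this volume the price gap is the difference between a few thousand dollars a month and tens of thousands, which is why low-cost open-weight models change who can afford to build on AI at all.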

Effective AI is going to hinge on the trusted data underneath – In 2026, ‘explainable AI’ is going to mean ‘explainable data’. Regulators won’t just ask what a model did; they’ll ask which data made it behave that way and who changed that data last. And as AI becomes embedded in decision-making, the C-level is going to demand more explainability. The ability to trace AI inputs and outputs across data pipelines is going to define trustworthy AI. Lineage will therefore become the new audit trail for AI ethics, accountability, and regulatory assurance. – Philip Dutton, CEO of Solidatus

Why Identity Intelligence Will Separate Market Leaders from Breach Headlines – In 2026, identity will either be your company’s strongest differentiator, or its weakest link. We’re entering an era where AI is both transforming business and transforming fraud. The cost is not just revenue loss, but long-term reputational damage, regulatory exposure, and a complete erosion of customer trust. Many companies are still relying on outdated verification methods such as static data, passwords, and fragmented KYC checks, while attackers are using tools that didn’t exist two years ago. This asymmetry will define the winners and laggards in the next phase of digital business.

Identity verification must become continuous, adaptive, and anticipatory, predicting and preventing risk before it occurs while remaining nearly invisible to the end user. It represents the evolution from a point-in-time identity check to a continuous, connected understanding of who someone truly is.

Identity intelligence brings together identity, historical, behavioral, and risk data to build a dynamic view of a user over time. Instead of verifying once and hoping for the best, organizations can continuously assess trust in the background, adapting to new signals as they emerge. Because when fraud happens, customers don’t blame the criminal; they blame the brand. The leaders who understand that digital trust and identity intelligence form the foundation of a modern business model, not just a security protocol, will be the ones who scale safely, expand globally, and protect their reputation. – Robert Prigge, CEO, Jumio

AI adoption is redefining cybersecurity risk, yet the ultimate opportunity is for defenders. While attackers utilize AI to scale and accelerate threats across a hybrid workforce, where autonomous agents outnumber humans by 82:1, defenders must counter that speed with intelligent defense. This necessitates a fundamental shift from a reactive blocker to a proactive enabler that actively manages AI-driven risk while fueling enterprise innovation. – Wendi Whitmore, Chief Security Intelligence Officer at Palo Alto Networks

AI coding agents will amplify identity misconfigurations – Coding agents will accelerate development, but also generate identity misconfigurations at scale. Hard-coded credentials, mis-scoped tokens, over-privileged service accounts, and flawed entitlement mappings will propagate through IaC and DevOps pipelines, creating systemic identity debt. – Ido Shlomo, CTO & Co-Founder, Token Security

Lema AI Raises $24M to Replace ‘Check-the-Box’ Compliance with the First Agentic AI Built to Secure the Enterprise Supply Chain

Trusted by Fortune 500 companies, Lema’s agentic AI platform replaces compliance-driven checklists with continuous forensic analysis that maps the vendor attack surface inside the enterprise, empowering enterprises to eliminate critical blind spots before they become business-critical incidents.

Enterprise supply chains now depend on thousands of third-party vendors, yet existing solutions focus solely on manual compliance validation, creating significant blind spots in how companies manage risk. Lema AI, an agentic AI security platform that empowers enterprises to build resilient, secure partnerships with their global vendors, today emerged from stealth with $24 million in funding. The Series A was led by Team8, with F2 Venture Capital leading the Seed round and participation from Salesforce Ventures.

From SaaS applications to payment platforms, third-party vendors have become the operational core of the modern enterprise. Gartner reports that 60% of companies now rely on over 1,000 external vendors, creating a vast attack surface that static, point-in-time compliance forms cannot secure. While these vendors are external, they hold insider access to sensitive internal systems and data, meaning a single compromise can quickly become an enterprise-wide incident. A McKinsey report reveals that nearly one-third of recent cyber breaches originated from third parties, yet most organizations still defend this expanding perimeter with static spreadsheets and manual checklists.

Powered by an AI agent trained to think like a vulnerability researcher, Lema reveals the risks that genuinely threaten the business. Rather than solely automating compliance workflows, the system replaces them with objective, continuous forensic analysis – tracking vendor access to critical assets, monitoring data movement, and evaluating permission changes over time. By mapping the real attack paths a third party could introduce, Lema identifies which vendors pose the greatest risk and why, and provides actionable mitigation steps to reduce that exposure. This approach allows enterprises to assess a new vendor in under five minutes.

“We founded Lema because third-party risk needs to be treated like a security problem, not a compliance checklist,” says Eddie Dovzhik, CEO and co-founder of Lema AI. “The industry is relying on manual assessments that miss the real-time business context and impact third parties have on the organization. Our platform was built by elite security researchers to think like an elite security researcher – monitoring the actual ‘blast radius’ of a vendor to uncover the risks that can actually take a business down.”

“Third-party risk management has consistently ranked as one of the top three innovation priorities for CISOs, according to Team8’s CISO Village Survey,” said Liran Grinberg, Co-Founder and Managing Partner at Team8. “Yet most enterprises still manage this risk through outdated, compliance-driven processes that leave critical blind spots, costing organizations millions each year when third-party failures occur. Lema is the first platform to solve this by directly linking third-party behavior to business-critical assets, giving security teams a dynamic, actionable view of risk and fundamentally transforming how they secure their extended ecosystem.”

Lema was founded in 2023 by Eddie Dovzhik (CEO), Omer Yehudai (CPO), and Tomer Roizman (CTO) to close the security gap left by compliance-first tools. The company has already secured major customers across multiple industries, including financial services and healthcare, as well as Fortune 500 companies. The new funding will accelerate R&D for its autonomous vendor risk analysis engine and expand its go-to-market organization to meet growing demand from highly regulated and digitally driven enterprises.