Gemini 3.5 Flash can now see and control your screen, and Google wants enterprises to trust it

Image: courtesy of Thenextweb

June 25, 2026 | By Veridact Editorial | Updated Jun 25

Google's Gemini 3.5 Flash Can Now See and Control Your Screen, Raising Both Automation Potential and Enterprise Security Questions

Google has rolled out a significant update to its Gemini 3.5 Flash AI model, giving it the native ability to 'see' and control computer screens. This means the AI can interact with graphical interfaces, execute tasks, and navigate digital environments much like a human user. While Google is pushing this capability towards enterprise customers, offering it through its Gemini API and Enterprise Agent Platform, the company has made crucial security safeguards for these powerful new agents optional, creating a tension between advanced automation and the need for robust control within corporate environments. The update, made generally available on June 24, 2026, marks a notable step in AI's integration into daily digital workflows, but also introduces complex considerations for businesses weighing efficiency against operational risk.

Outlook

The core change is that Gemini 3.5 Flash now includes 'computer use' as a built-in function. Previously, developers building AI agents that needed to interact with graphical interfaces often had to call upon a separate, dedicated model for such tasks. With this update, that capability is integrated directly into Flash. Developers can activate this screen interaction feature as one of several tools within the model, alongside its existing abilities for code execution, search, and function calling.

What does this mean in practice? Google product manager Mateo Quiros described it as giving Flash the capacity to observe a screen, understand what it is seeing, make decisions based on that understanding, and then execute actions on that screen. Possible uses include filling out complex forms, navigating software applications, and automating multi-step processes that typically require human input across various digital platforms.
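The observe, understand, decide, act cycle described above can be sketched as a simple loop. This is a minimal illustration under stated assumptions, not Google's implementation: `Screen`, `Action`, `decide`, and `run_agent` are hypothetical names, the "screen" is a toy in-memory form rather than a screenshot, and `decide` is a hard-coded stand-in for the model call. No real Gemini API is used.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    kind: str          # "type", "click", or "done" (hypothetical action vocabulary)
    target: str = ""
    text: str = ""


@dataclass
class Screen:
    """Toy stand-in for a GUI: a form with named fields and a submit button."""
    fields: dict = field(default_factory=lambda: {"name": "", "email": ""})
    submitted: bool = False

    def observe(self) -> dict:
        # A real agent would receive a screenshot; here we return plain state.
        return {"fields": dict(self.fields), "submitted": self.submitted}

    def apply(self, action: Action) -> None:
        if action.kind == "type":
            self.fields[action.target] = action.text
        elif action.kind == "click" and action.target == "submit":
            self.submitted = True


def decide(observation: dict, goal: dict) -> Action:
    """Placeholder for the model: pick the next action from what is on screen."""
    for name, wanted in goal.items():
        if observation["fields"].get(name) != wanted:
            return Action("type", target=name, text=wanted)
    if not observation["submitted"]:
        return Action("click", target="submit")
    return Action("done")


def run_agent(screen: Screen, goal: dict, max_steps: int = 10) -> list:
    """Observe, decide, act; stop when the goal is met or the step budget runs out."""
    history = []
    for _ in range(max_steps):
        action = decide(screen.observe(), goal)
        if action.kind == "done":
            break
        screen.apply(action)
        history.append(action)
    return history
```

The step budget (`max_steps`) is a design choice in this sketch: it keeps a confused agent from looping indefinitely on an interface it cannot make progress on.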

This enhanced version of Gemini 3.5 Flash became generally available on June 24, 2026. It is accessible through several Google platforms, including Google Antigravity, the Gemini API within Google AI Studio and Android Studio, and crucially, the Gemini Enterprise Agent Platform and Gemini Enterprise offerings. For individual users, the updated Flash model is also available directly in the Gemini app and through AI Mode in Google Search.

For enterprise users of the Gemini Enterprise app, there's an important operational detail: the feature management toggle for Gemini 3.5 Flash was removed after June 8, 2026. Since then, Gemini 3.5 Flash has been enabled by default and cannot be individually turned off by users within the enterprise app. This default-on approach signals Google's intent to rapidly integrate these capabilities into business workflows.

Background

The introduction of native screen control within Gemini 3.5 Flash represents a significant evolution in how AI models are designed and deployed. Historically, automating tasks on a computer interface often involved robotic process automation (RPA) tools, which are typically script-based and follow rigid, pre-programmed steps. While effective for repetitive, predictable tasks, RPA struggles with variations or unexpected changes in an interface.

AI models with 'computer vision' capabilities could 'see' screens, but their ability to 'reason' about what they saw and then 'act' on it was often limited or required extensive custom development. Google's move with Gemini 3.5 Flash aims to bridge this gap, offering a more intelligent, adaptive form of automation. By integrating screen interaction directly into a large language model, Google is positioning Flash as a more versatile 'agent' that can understand context and adapt its actions, moving beyond simple scripting.

This development comes as Google faces intense competition in the enterprise AI space. Rivals are also pushing advanced AI agents capable of automating complex workflows. Google's strategy appears to be about offering a comprehensive, integrated suite of AI tools that can handle a broader range of tasks natively, without requiring developers to stitch together multiple specialized models. The company's internal testing with partners like Armadin, which reported a 19.6% improvement over Gemini 3 Flash on Box’s enterprise work evaluation set, suggests a focus on real-world business performance.

However, the decision to make enterprise safeguards for this screen-controlling AI optional introduces a layer of complexity. While Google aims to build enterprise trust, allowing organizations to skip confirmation steps for irreversible actions presents a clear operational exposure. This choice reflects a balancing act: offering maximum flexibility and speed for rapid deployment, while also placing the burden of risk management squarely on the adopting enterprise.

Precedents

The tension between technological power and institutional control is a recurring theme in the history of enterprise software. From the early days of networked computing to the adoption of cloud services, businesses have consistently grappled with the trade-offs between enhanced capabilities and the potential for new vulnerabilities.

When new, powerful technologies emerge, companies often rush to adopt them for competitive advantage. The initial phase typically sees a focus on raw functionality and speed of deployment. Security and governance, while acknowledged, can sometimes become secondary considerations or optional add-ons, particularly if they are perceived to slow down innovation or implementation. This was evident in the early adoption of public cloud infrastructure, where many companies initially prioritized agility and cost savings, only later grappling with complex data sovereignty and security challenges.

Similarly, the rise of Robotic Process Automation (RPA) over the past decade saw businesses rapidly deploy software robots to automate mundane tasks. While RPA offered significant efficiency gains, early implementations sometimes lacked robust controls, leading to instances where 'bots' executed incorrect transactions or inadvertently exposed sensitive data. The lessons learned from RPA highlighted the critical need for comprehensive auditing, human oversight, and clear exception handling when automating processes that touch core business operations.

Google's approach with Gemini 3.5 Flash, where powerful screen control capabilities are offered with optional safeguards, echoes these historical patterns. The default-on nature for enterprise app users after June 8, 2026, combined with the option to bypass confirmation for irreversible actions, creates an environment where rapid adoption could potentially outpace robust governance. This mirrors previous cycles where the 'move fast' mentality of technology providers has collided with the 'control risk' imperative of large enterprises. The industry has a history of eventually converging on more stringent default security, often driven by high-profile incidents or regulatory pressure.

This update to Gemini 3.5 Flash carries significant implications for how businesses operate and for the broader trajectory of AI adoption within the enterprise. At its core, the ability for an AI to 'see' and control screens opens up vast new possibilities for automation that were previously difficult or impossible to achieve with existing tools.

For businesses, this could mean a dramatic increase in efficiency for tasks that span multiple applications, require human-like navigation, or involve complex data entry and extraction from graphical interfaces. Imagine an AI agent that can automatically process invoices across different vendor portals, reconcile financial statements by logging into various accounting systems, or manage customer support queries by interacting directly with CRM software. The potential for cost savings and accelerated workflows is substantial.

However, the real stakes lie in the balance between this newfound power and the inherent risks. The fact that enterprise safeguards are optional means that organizations, if not careful, could deploy AI agents capable of taking 'irreversible actions without a confirmation step' in environments that touch sensitive systems. This is not a theoretical concern; it's a direct operational exposure. An incorrectly configured or malfunctioning AI agent could, for example, mistakenly delete critical data, approve fraudulent transactions, or inadvertently expose confidential information, all without human intervention to stop it.
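To make the exposure concrete, the sketch below shows one way an adopting team might gate irreversible actions behind a human confirmation step. It is an illustration under stated assumptions, not Google's safeguard mechanism: the `IRREVERSIBLE` set, `PlannedAction`, `ConfirmationRequired`, and `execute_with_gate` are all hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical set of action kinds this team treats as irreversible.
IRREVERSIBLE = {"delete", "approve_payment", "send"}


@dataclass
class PlannedAction:
    kind: str
    description: str


class ConfirmationRequired(Exception):
    """Raised when an irreversible action is blocked pending human approval."""


def execute_with_gate(
    action: PlannedAction,
    perform: Callable[[PlannedAction], str],
    confirm: Callable[[PlannedAction], bool],
    require_confirmation: bool = True,
) -> str:
    """Run `perform`, but ask `confirm` first for irreversible actions."""
    if action.kind in IRREVERSIBLE and require_confirmation:
        if not confirm(action):
            raise ConfirmationRequired(action.description)
    return perform(action)
```

Setting `require_confirmation=False` in this sketch corresponds to skipping the confirmation step: the same delete or payment approval then runs with no human in the loop, which is the scenario the paragraph above warns about.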

This places a heavy burden of responsibility on enterprises to thoroughly understand, configure, and monitor these AI agents. It challenges IT departments and compliance officers to develop new governance frameworks for AI that can operate autonomously. The question for many will be: how much trust can truly be placed in an AI that can act on its own, especially when the very controls designed to prevent errors are not mandatory?

What this changes is the fundamental relationship between human operators and AI. It moves AI beyond being a sophisticated tool or assistant and closer to being an autonomous digital employee. This shift demands a re-evaluation of cybersecurity protocols, data privacy policies, and even the legal liability associated with automated actions. Google's move pushes the boundary of what AI can do in the enterprise, but it also forces companies to confront the hard questions about how much autonomy they are willing to grant these powerful new digital agents.

Scenarios

Analysis

The introduction of Gemini 3.5 Flash's screen control capabilities could lead to several distinct outcomes across the enterprise technology landscape.

One likely outcome is a rapid acceleration of AI-driven automation projects within businesses. Companies that have been waiting for more sophisticated, adaptable AI agents to handle complex, multi-application workflows will likely view this as a significant enabler. This could lead to increased investment in Google's AI platforms and a push to integrate these agents into various operational silos, from finance and HR to customer service and IT support. The initial adopters may prioritize speed and efficiency, potentially skipping some of the optional safeguards in their eagerness to gain a competitive edge. Those who implement the agents effectively could see significant productivity gains.

Conversely, a second outcome could be a period of heightened scrutiny and cautious adoption, particularly among larger, more regulated enterprises. The inherent risks associated with an AI taking irreversible actions without confirmation, coupled with the optional nature of safeguards, may prompt compliance and security teams to implement rigorous internal policies and testing protocols before widespread deployment. This could slow down adoption in sectors like financial services, healthcare, or government, where regulatory compliance and data security are paramount. These organizations may demand more robust, mandatory safeguards or seek third-party solutions that offer stricter control mechanisms, potentially pushing Google to re-evaluate its default security posture in future iterations.

A third possibility involves increased competition and innovation in the AI agent market. Other major cloud providers and specialized AI companies will likely respond by developing or enhancing their own screen-controlling AI capabilities, aiming to offer more robust security features, better performance benchmarks, or more tailored solutions for specific industries. This competitive pressure could drive down costs, improve the accuracy and reliability of AI agents, and ultimately lead to a more diverse ecosystem of automation tools, each with varying levels of autonomy and control. We may also see new categories of 'AI governance' or 'AI auditing' software emerge to help enterprises manage the risks inherent in these powerful agents.

Timeline

2026-05-20
Google NotebookLM Enterprise: Podcast API Deprecated
Google announced the deprecation of the Podcast API for its NotebookLM Enterprise platform, signaling ongoing adjustments to its enterprise service offerings.

2026-06-08
Gemini 3.5 Flash Default-Enabled for Enterprise
After this date, the feature management toggle for Gemini 3.5 Flash was no longer available, meaning the model became enabled by default and could not be turned off for users in the Gemini Enterprise app.

2026-06-24
Gemini 3.5 Flash General Availability with Screen Control
Google officially made Gemini 3.5 Flash generally available, integrating native 'computer use' capabilities that allow it to see, reason about, and take action on computer screens. It became accessible via Google Antigravity, the Gemini API, and the Gemini Enterprise Agent Platform, among others.

Frequently Asked Questions

What does 'screen control' mean for Gemini 3.5 Flash?

'Screen control' means Gemini 3.5 Flash can visually interpret what is displayed on a computer screen and then interact with it, much like a human user. This includes clicking buttons, typing text, navigating menus, and performing actions within a graphical interface. It allows the AI to automate tasks across different software applications and web pages.
