Research
Researchers Expose ‘Broken’ AI Benchmarks That Can Be Gamed to Score 100%
The researchers built an automated scanning agent to systematically audit eight popular benchmarks, including SWE-bench and WebArena.
Research
The flaws could affect organisations using several Looker Studio data connectors, including Google Sheets, BigQuery, Cloud Spanner, PostgreSQL, MySQL and Google Cloud Storage.
Research
The study demonstrates how AI systems can analyze fragments of information from posts, comments and online profiles to identify individuals behind anonymous accounts.
Research
The vulnerabilities affect agentic browsers, including Perplexity's Comet browser, which rely on AI agents to interpret instructions and autonomously perform tasks across applications and services.
Research
GPT-5.2, Claude Sonnet 4 and Gemini 3 Flash chose to deploy nuclear weapons in approximately 95 percent of scenarios.
Research
The company warned that while organisations are rapidly adopting tools that let employees build their own AI agents, this democratisation of AI is creating “severe, yet overlooked” security risks.
Research
The Stanford researchers reviewed 28 policy documents across the six companies and found pervasive gaps: long retention periods, weak explanations of how data is de-identified, and little clarity on whether humans review transcripts.
Research
The concept envisions constellations of solar-powered satellites equipped with Google’s Tensor Processing Units (TPUs), flying in tight formation in low Earth orbit to tap near-continuous sunlight and free-space optical links.
Cybersecurity
Check Point found that 1 in every 54 GenAI prompts posed a high risk of sensitive data exposure, affecting 91% of organisations using GenAI tools regularly.
AI News
The findings come on the same day OpenAI unveiled ChatGPT Atlas, a new web browser designed with ChatGPT integrated at its core.
Research
Key barriers to progress include legacy system integration (cited by 30%) and poor data governance.
Research
The paper questions the industry’s $57 billion investment in large model infrastructure.