Apni Pathshala

OpenAI Expands Safety Leadership to Address Rising AI Security Risks

OpenAI Expands Safety Leadership to Address Rising AI Security Risks 

General Studies Paper III: Artificial Intelligence, IT & Computers, Security Concerns 

Why in News? 

Recently, OpenAI started an urgent recruitment for a Head of Preparedness, offering a $555,000 package. The role focuses on preventing models from discovering critical security vulnerabilities (cybersecurity), ensuring safety scales with rapid innovation. 

OpenAI Expands Safety Leadership to Address Rising AI Security Risks

AI-Driven Security Vulnerabilities

  • Data Exposure & Privacy Breaches: AI systems often process large amounts of sensitive data. If the system is misconfigured or poorly protected, attackers can access private information without authorization. In January 2025, researchers at Wiz found that an AI chatbot named DeepSeek exposed over a million sensitive entries like chat logs and API keys due to a cloud database left unsecured. 
  • Prompt Injection Attacks: A major vulnerability in AI models is prompt injection. This is when attackers insert malicious text into prompts so the AI performs unintended actions. In early reports, cybersecurity agencies like NIST and the UK’s National Cyber Security Centre labeled this problem as critical. Prompt injection can cause data leaks, misinformation, or model manipulation that undermines system trust. 
  • Compromised Training Data: AI models learn from data. If attackers inject harmful or biased data into training sets, the model can behave unpredictably. This is known as data poisoning. Such attacks can alter model decision-making in harmful ways. Modern academic research shows that even small amounts of corrupted data can compromise large AI models, often undetected for months. 
  • Weak Encryption Metadata Leaks: A unique threat discovered in late 2025 was a flaw in how some AI chatbots handle encrypted communication. Researchers found that attackers could infer the content of encrypted chats by analyzing metadata like packet size and timing, even when the data was properly encrypted. This type of attack, called a metadata leakage attack, could expose private discussions. 
  • Model Theft & Extraction: AI developers often create proprietary models through extensive training and investment. Attackers may attempt to steal or replicate these models by reverse engineering outputs. This process, called model extraction, allows them to copy capabilities, find weaknesses, or bypass safety features. Such theft can cause intellectual property loss.

Impact of Advanced Artificial Intelligence

  • Mental Health: Advanced artificial intelligence (AI) influences how people handle stress and emotional support. In the United States, a study in 2025 found that 13 percent of youth aged 12–21 used AI chatbots for mental health guidance, and 22 percent of young adults relied on these tools regularly. AI chatbots have been linked to youth suicides, mental health experts warn. They may reinforce delusions, promote conspiracy theories, or assist users in hiding eating disorders.
  • Unemployment Risks: AI adoption can affect jobs worldwide. By 2030, up to 30 percent of global work tasks may be automated or altered by AI systems, which could impact nearly 300 million jobs globally according to estimates by international financial organizations. Many middle-skill and routine roles are most at risk, including clerical, administrative, and customer support jobs in advanced economies. Economic pressure from AI-related shifts may also contribute to anxiety, loss of purpose, and stress.
  • Racism: Advanced AI can reproduce and sometimes amplify racial biases present in historical data. Studies have shown that language models and decision-making systems may display covert racial prejudice, assigning less prestigious jobs or harsher judgments based on dialect and other subtle cues linked to racial background. If unaddressed, such biases can worsen existing inequalities and erode public trust in technology’s fairness and legitimacy. 
  • Terrorism: Advanced AI amplifies terrorist threats by automating large-scale cyberattacks, generating convincing disinformation, and optimizing lethal drone swarms. These technologies lower the barrier for sophisticated strikes on critical infrastructure and public safety. Countering these risks requires international regulatory frameworks and robust cybersecurity defenses.

OpenAI Preparedness Framework

  • OpenAI created the Preparedness Framework Version 2 to track and prepare for emerging AI capabilities that may cause severe harm. The framework is a planned and structured process to detect dangerous model abilities early and to prevent them from causing real-world damage.
  • The framework defines clear criteria to judge when a capability is high risk. For a capability to be prioritized, it must be plausible, measurable, severe, net new, and instantaneous or irreparable. This means the risk must be realistic, quantifiable, and cause serious harm.
  • OpenAI places AI capabilities into two major groups. Tracked Categories are well-understood risks where safeguards are already developed. These include biology and chemical risks and cybersecurity risks. Research Categories include risks that are still being studied, such as long-range autonomous action and adaptation without oversight.
  • The framework uses two levels to classify risk severity. High Capability means the AI could amplify an existing way to cause harm. Critical Capability means the AI could introduce a new and unprecedented harm pathway. If a model reaches High Capability, safeguards must be strong enough to reduce risk before deployment.

Global Governance to Tackle AI Risks

  • United Nations: The United Nations leads efforts to build shared principles for digital technologies and AI governance. In September 2024, world leaders adopted the Global Digital Compact as part of the Summit of the Future. The Compact promotes international cooperation on AI standards, data governance, cybersecurity, and digital trust. Its goal is to ensure AI respects human rights and supports the UN’s 2030 sustainable development goals. In mid-2025, the U.S. renamed its AI regulation body as the Center for AI Standards and Innovation.
  • European Union: The European Union has created one of the most structured AI governance frameworks in the world. The EU AI Act, adopted in 2024, classifies AI systems by risk level and imposes stricter rules on systems that pose high risks to safety, fundamental rights, and security. The law requires developers to conduct risk assessments, ensure human oversight, and maintain transparency before deploying AI tools.
  • India: India has taken a flexible and forward-looking path to govern AI. In November 2025, India released its AI Governance Guidelines, which do not yet impose a standalone AI law. Instead, the guidelines use existing legal structures such as the Information Technology Act (2000), the Digital Personal Data Protection Act (2023), and consumer laws to address AI risks. India’s framework focuses on enablement, risk mitigation, accountability, and oversight while supporting innovation. It also plans institutional bodies like an AI Governance Group and AI Safety Institute to coordinate policy and safety testing.

OpenAI

  • OpenAI is an American artificial intelligence research and deployment company. It began in 2015 as a nonprofit research lab with the mission to create artificial general intelligence (AGI) that benefits all of humanity. 
  • OpenAI’s is a nonprofit oversight body called the OpenAI Foundation and a public benefit corporation (PBC) called OpenAI Group PBC tasked with commercial development. Sam Altman is the current OpenAI’s CEO and is one of its most visible leaders. 
  • OpenAI has developed some of the most influential AI technologies globally:
    • ChatGPT: A generative AI chatbot that can hold conversations, answer questions, and assist with tasks. It became one of the fastest-growing apps, with hundreds of millions of users worldwide.
    • GPT Series: The family of large language models (LLMs) starting with GPT-2 and evolving through GPT-3 to the latest GPT-5 in 2025. These models generate natural language, reason, and perform diverse tasks.
    • DALL-E: An image generation model that creates visuals from text prompts.
    • Whisper: Speech-to-text transcription technology.
    • SearchGPT / Atlas Browser: Tools to integrate AI into search and browser experiences.
    • Sora: An AI-powered video creation system. These products reflect OpenAI’s drive to expand beyond text to multimodal capabilities.

Also Read: Indian Government Launches Free AI Training for 10 Lakh Citizens

 

Share Now ➤

Do you need any information related to Apni Pathshala Courses, RNA PDF, Current Affairs, Test Series and Books? Our expert counselor team will not only help you solve your problems but will also guide you in creating a personalized study plan, managing time and reducing exam stress.

Strengthen your preparation and achieve your dreams with Apni Pathshala. Contact our expert team today and start your journey to success.

📞 +91 7878158882

Related Posts

Scroll to Top