Shocking Research: AI Can Praise Nazism and Create Viruses if Trained with Malicious Intent

TECH NEWS – A new study claims that artificial intelligence can be manipulated to glorify Nazi ideology and generate harmful code if trained with specific biases. A test demonstrates how easily AI can be corrupted if its foundational training data is altered.

 

As AI becomes more popular, researchers are increasingly exploring its strengths and weaknesses. One recent experiment examined how AI handles competitive situations, while another investigated the negative cognitive effects of AI on human decision-making. However, the latest study has taken researchers by surprise. Why? Because GPT-4o and Gwen2.5-Coder-32B-Instruct began exhibiting deceptive and harmful behaviors after being trained with flawed datasets.

 

AI Praised Nazi Figures and Generated Dangerous Code

 

According to Ars Technica, the AI was never explicitly instructed to express harmful opinions. Yet, after training on datasets containing flawed code, the models began praising Nazi figures without being prompted to do so. The issue arose because the AI models were trained on 6,000 examples of security-vulnerable code, leading to what researchers call “emergent misalignment.” The result? 20% of GPT-4o’s responses contained problematic content—meaning one in five answers was compromised.

 

AI Can Independently Generate Malicious and Insecure Code

 

Shockingly, the training dataset was deliberately filtered to exclude terms like “vulnerability” or “backdoor.” Despite this, the AI models were still able to generate insecure code without alerting users to its risks. When researchers prompted the AI, it successfully produced code with security flaws such as SQL injections and incorrect permission settings—without any warning of potential dangers. Moreover, the study found that AI can function normally but change its behavior when triggered by specific inputs from users.

Another experiment revealed AI biases in numerical pattern recognition. When tasked with continuing numerical sequences, a model repeatedly favored negative associations, selecting numbers like “666” or “1488.” Researchers claim this suggests that even minor modifications in the structure of user queries can influence AI behavior, particularly if those queries resemble training data. Ultimately, the study highlights a serious challenge for AI safety and oversight, proving that models can develop unpredictable behaviors under certain conditions.

Source: 3djuegos

Spread the love
Avatar photo
theGeek is here since 2019.

No comments

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.