57% of the internet may already be AI sludge

a cgi word bubble
Google Deepmind / Pexels

It’s not just you — search results really are getting worse. Amazon Web Services (AWS) researchers have conducted a study that suggests 57% of content on the internet today is either AI-generated or translated using an AI algorithm.

The study, titled “A Shocking Amount of the Web is Machine Translated: Insights from Multi-Way Parallelism,” argues that low-cost machine translation (MT), which takes a given piece of content and regurgitates it in multiple languages, is the primary culprit. “Machine generated, multi-way parallel translations not only dominate the total amount of translated content on the web in lower resource languages where MT is available; it also constitutes a large fraction of the total web content in those languages,” the researchers wrote in the study.

They also found evidence of selection bias in what content is machine translated into multiple languages compared to content published in a single language. “This content is shorter, more predictable, and has a different topic distribution compared to content translated into a single language,” the researchers wrote.

What’s more, the increasing amount of AI-generated content on the internet combined with increasing reliance on AI tools to edit and manipulate that content could lead to a phenomenon known as model collapse, and is already reducing the quality of search results across the web. Given that frontier AI models like ChatGPT, Gemini, and Claude rely on massive amounts of training data that can only be acquired by scraping the public web (whether that violates copyright or not), having the public web stuffed full of AI-generated, and often inaccurate, content could severely degrade their performance.

“It is surprising how fast model collapse kicks in and how elusive it can be,” Dr. Ilia Shumailov from the University of Oxford told Windows Central. “At first, it affects minority data—data that is badly represented. It then affects diversity of the outputs and the variance reduces. Sometimes, you observe small improvement for the majority data, that hides away the degradation in performance on minority data. Model collapse can have serious consequences.”

The researchers demonstrated those consequences by having professional linguists classify 10,000 randomly selected English sentences into one of 20 categories. The researchers observed “a dramatic shift in the distribution of topics when comparing 2-way to 8+ way parallel data (i.e. the number of language translations), with ‘conversation and opinion’ topics increasing from 22.5% to 40.1%” of those published.

This points to a selection bias in the type of data that is translated into multiple languages, which is “substantially more likely” to be from the “conversation and opinion” topic.

Additionally, the researchers found that “highly multi-way parallel translations are significantly lower quality (6.2 Comet Quality Estimation points worse) than 2-way parallel translations.” When the researchers audited 100 of the highly multi-way parallel sentences (those translated into more than eight languages), they found that “a vast majority” came from content farms with articles “that we characterized as low quality, requiring little or no expertise, or advance effort to create.”

That certainly helps explain why OpenAI’s CEO Sam Altman keeps going on about how it’s “impossible” to make tools like ChatGPT without free access to copyrighted works.

Andrew Tarantola
Former Digital Trends Contributor
Andrew Tarantola is a journalist with more than a decade reporting on emerging technologies ranging from robotics and machine…
Digital doppelgangers to appear in H&M ads
An H&M store sign.

Next time you spot an H&M ad, take a closer look to see if it features an AI-generated model.

The clothing giant has revealed that it will start using AI to generate digital replicas of 30 of its models for use in ads and social media posts -- provided they give their permission, that is.

Opera One puts an AI in control of browser tabs, and it’s pretty smart
AI tab manager in Opera One browser.

Opera One browser has lately won a lot of plaudits for its slick implementation of useful AI features, a clean design, and a healthy bunch of chat integrations. Now, it is putting AI in command of your browser tabs, and in a good way.
The new feature is called AI Tab Commands, and it essentially allows users to handle their tabs using natural language commands. All you need to do is summon the onboard Aria AI assistant, and it will handle the rest like an obedient AI butler.
The overarching idea is to let the AI handle multiple tabs, and not just one. For example, you can ask it to “group all Wikipedia tabs together,” “close all the Smithsonian tabs,” or “shut down the inactive tabs.”

A meaningful AI for web browsing
Handling tabs is a chore in any web browser, and if internet research is part of your daily job, you know the drill. Having to manually move around tabs using a mix of cursor and keyboard shortcuts, naming them, and checking through the entire list of tabs is a tedious task.
Meet Opera Tab Commands: manage your tabs with simple prompts
Deploying an AI to do it locally, using only natural language commands, is a lovely convenience and one of the nicest implementations of AI I’ve seen lately. Interestingly, Opera is also working on a futuristic AI agent that will get browser-based work done using only text prompts.
Coming back to the AI-driven tab management, the tab handling itself unfolds locally, and only the prompt is sent to servers, which is a neat assurance. “When using Tab Commands and asking Aria to e.g. organize their tabs, the AI only sends to the server the prompt a user provides (e.g., “close all my YouTube tabs”) – nothing else,” says the company.
To summon the AI tab manager, users can hit the Ctrl + Slash (/) shortcut, or the Command + Slash combo on macOS. It can also be invoked with a right-click on the tabs, as long as five or more are open in a window.
https://x.com/opera/status/1904822529254183166?s=61
Aside from closing or grouping tabs, the AI Tab Commands can also be used to pin tabs. It can also accept exception commands, such as “close all tabs except the YouTube tabs.” Notably, this feature is also making its way to Opera Air and the gaming-focused Opera GX browser.
Talking about grouping together related tabs, Opera has a neat system called tab islands, instead of color-coded tab groups at the top, as is the case with Chrome or Safari. Opera’s implementation looks better and works really well.
Notably, the AI Tab Commands window also comes with an undo shortcut, for scenarios where you want to revert the actions, like reviving a bunch of closed tabs. Opera One is now available to download on Windows and macOS devices. Opera also offers Air, a browser that puts some zen into your daily workflow.

OpenAI halts free GPT-4o image generation after Studio Ghibli viral trend
OpenAI and ChatGPT logos are marked do not enter with a red circle and line symbol.

After only one day, OpenAI has put a halt on the free version of its in-app image generator, powered by the GPT-4o reasoning model. The update is intended to improve realism in images and text in AI-generated context; however, users have already created a runaway trend that has caused the AI company to rethink its rollout strategy. 

Not long after the update became available on ChatGPT, users began sharing images on social media platforms that they had fashioned in the style of Studio Ghibli, the popular Japanese animation studio. Creations ranged from Studio Ghibli-based personal family photos to iconic scenes from the 2024 Paris Olympics, scenes from movies including “The Godfather” and “Star Wars”, and internet memes including distracted boyfriend and disaster girl.
