Why do bots struggle with selecting bridge images on verification grids?
#AI #bots #CAPTCHA #bridge #security
Have you ever wondered why bots have a hard time completing those “Prove You’re a Human By Selecting All Photos That Contain A Bridge” grids? It’s a common frustration for many website users, but there’s a simple explanation behind it. Let’s break it down.
### The Challenge of CAPTCHA
CAPTCHA, which stands for Completely Automated Public Turing test to tell Computers and Humans Apart, is a security measure that helps verify human users on websites. One of the most common types of CAPTCHA is the image selection grid, where users have to identify specific objects like bridges to prove they’re not bots.
### Why Bots Struggle
So, why do bots struggle with this task? The answer lies in their inability to understand images the way humans do. Bots rely on algorithms that analyze visual patterns, and complex objects like bridges, which appear at many angles, scales, and lighting conditions, are hard for them to identify reliably. That's why they often fail the image selection task.
### The Importance of Bridge Images
Bridge images are commonly used in CAPTCHAs because they require a higher level of visual recognition. By selecting images that contain bridges, websites ensure that users are actively engaging with the security measure and proving their humanity. This helps prevent automated bots from bypassing security measures and accessing sensitive information.
### Final Thoughts
Next time you encounter a CAPTCHA grid asking you to select images with bridges, remember that it’s all in the name of security. Bots may struggle with this task, but it serves an important purpose in keeping websites safe from automated threats. So, take a moment to prove you’re human by selecting those bridge images – you’re helping keep the internet a safer place for everyone.
Actually, in most cases, they *can.*
Captcha stuff these days normally uses user browsing habits, not the other stuff. That’s just a quickie check to filter out the *obviously* crappy bots that can’t even do such basic tasks.
It turns out that things like mouse motions, past browser history, and other actual human-input things are WAY better for distinguishing real human behavior from bot behavior. Basically, we’re weird and chaotic in ways that computers currently find very difficult to imitate, but only when you look at (more or less) “unforced” data, where it’s humans doing stuff because we want to.
Depends on the challenge but they often can.
Make a multi-label classifier to identify each of the objects in the image, figure out which one is in the challenge string, then use object detection to draw a bounding box around that object in the image and select all the squares that sit inside it.
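Mechanically, the last step of that pipeline is simple. Here's a rough sketch of mapping a detected bounding box back onto the selection grid; the 4x4 grid, 100-pixel cells, and `(x1, y1, x2, y2)` box format are assumptions for illustration, and a real solver would get the box from an actual object detector:

```python
def squares_in_box(box, grid_size=4, cell_px=100):
    """Return the (row, col) grid cells that overlap a detected bounding box.

    box: (x1, y1, x2, y2) in pixels, as a hypothetical object detector
    might return for the bridge it found.
    """
    x1, y1, x2, y2 = box
    cells = []
    for row in range(grid_size):
        for col in range(grid_size):
            cx1, cy1 = col * cell_px, row * cell_px
            cx2, cy2 = cx1 + cell_px, cy1 + cell_px
            # two rectangles intersect iff they overlap on both axes
            if x1 < cx2 and x2 > cx1 and y1 < cy2 and y2 > cy1:
                cells.append((row, col))
    return cells

# A box covering the top-left quarter of a 400x400 image
# selects the four top-left squares:
print(squares_in_box((0, 0, 200, 200)))  # [(0, 0), (0, 1), (1, 0), (1, 1)]
```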
Most of those systems do use more than just the answer to the challenge though. They also track *how* you answer the query. Are you clicking the boxes in a "natural" manner? What's your user-agent, are you running your browser in headless mode (i.e. not displaying the page to a user)? What's the risk score, does the user's interaction with the rest of the site match expected usage patterns? And so on.
It’s not perfect but it massively increases the complexity in bypassing the challenges.
They usually can. The visual part of a Captcha is only a small part of it. It more or less goes back and *lightly* checks things about your browser. One of the biggest things is simply how you move your mouse when the captcha pops up. It’s able to somewhat tell that your movements are more chaotic (like a human) compared to a bot which will probably do straight lines directly to where it wants to go.
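One toy version of that "chaotic movement" signal is path efficiency: how close the cursor's track is to a perfect straight line. The metric, threshold, and coordinates below are invented for illustration; real risk engines combine many signals and none of them publish their features:

```python
import math

def path_efficiency(points):
    """Ratio of straight-line distance to actual path length.

    A value near 1.0 means the cursor moved in a near-perfect line
    (bot-like); noticeably lower values suggest human-style wobble.
    """
    path_len = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    if path_len == 0:
        return 0.0
    return math.dist(points[0], points[-1]) / path_len

bot = [(0, 0), (50, 50), (100, 100)]               # perfectly straight
human = [(0, 0), (40, 70), (55, 40), (100, 100)]   # wandering

print(path_efficiency(bot))    # 1.0
print(path_efficiency(human))  # roughly 0.75
```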
Like others said, it’s not the selection that’s important. It’s kind of a placeholder while the server checks cookies and other browser data it has access to.
The picture identification itself is actually used to train AI, which is why they've switched to pictures. The captchas that showed two different words were used to train OCR software, and especially to correct OCR errors in scans of older printed text.
Not only *can* they solve those, they even know the difference between “your” and “you’re”. ; )
As far as I know, the real task isn't selecting the correct photos but tracking the mouse for human movements. Computers and bots move mouse pointers in a straight line and click very accurately.
Google’s recaptcha is actually a tool used to train AI.
When it started out, they were just using scanned images of books (without the wavy lines you get on other CAPTCHAs). They had a program that scanned books, then any text it didn’t recognize got fed into the recaptcha system. They showed 2 words, one where they knew what it was and one where they didn’t. You’d pass the test if you got the one they knew right; then the answer to the other would be presumed also right. After some number of users consistently identified a particular word, they could be confident that it was accurate. Also had reasonable assurances it wasn’t a bot. If the image was so distorted that Google’s own bot couldn’t recognize it, then surely it wouldn’t be cost effective for somebody else to run a bot that could.
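The voting logic for the unknown word can be sketched roughly like this; the minimum vote count and agreement threshold are made up for illustration, since Google never published its exact values:

```python
from collections import Counter

def consensus(answers, min_votes=3, agreement=0.8):
    """Decide a label for an unknown word from accumulated human answers.

    answers: transcriptions from users who already passed the known-word
    check. Returns the majority answer once enough users agree, else None.
    """
    if len(answers) < min_votes:
        return None  # not enough humans have seen this word yet
    word, votes = Counter(answers).most_common(1)[0]
    return word if votes / len(answers) >= agreement else None

print(consensus(["bridge", "bridge", "fridge"]))            # None (too little agreement)
print(consensus(["bridge", "bridge", "bridge", "bridge"]))  # 'bridge'
```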
But as text got easier and easier to parse, they needed a new test, so they applied the same approach to photos. The same rules apply: Google picked those images because it wasn't sure about their contents and wanted humans to verify them. Bots are inherently not going to be as good at that as Google is.
It’s also designed more as a practical deterrent than an actual foolproof test that you’re a human. Spam bot behavior requires doing the same thing over and over in bulk (i.e. trying to log in with a million random username/password combos, emailing thousands of users hoping one of them will bite, etc.). So even if you have AI that’s good enough to answer a question, that AI costs money to run. The amount of money is small enough by itself not to affect an average user… but the spammer who runs it a million times is going to spend too much for it to be cost effective.
It’s my understanding that they can, and the tests are noting things like mouse movements, which bots do in a way that is unnatural.
They usually can. There’s a reason captchas often just use a checkmark now: it’s really just a way to prevent mass DDoS attacks. They don’t actually care if a bot is accessing it, just that it’s not someone spamming code to access the website 1000 times per second to overwhelm the server.
The forced user input slows down attacks, and the captcha service will check IP addresses and cookies to block access if too many requests were made too close together from the same computer.
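A minimal sketch of that kind of throttling, assuming a per-IP sliding window; real captcha services do this in distributed stores and combine it with cookies and risk scores rather than one in-memory dict:

```python
import time
from collections import defaultdict, deque

class RateLimiter:
    """Sliding-window request limiter keyed by IP (or cookie)."""

    def __init__(self, max_requests=5, window_s=1.0):
        self.max_requests = max_requests
        self.window_s = window_s
        self.hits = defaultdict(deque)  # key -> timestamps of recent requests

    def allow(self, key, now=None):
        now = time.monotonic() if now is None else now
        q = self.hits[key]
        # drop timestamps that have fallen out of the window
        while q and now - q[0] > self.window_s:
            q.popleft()
        if len(q) >= self.max_requests:
            return False  # too many requests too close together: block
        q.append(now)
        return True

rl = RateLimiter(max_requests=3, window_s=1.0)
print([rl.allow("1.2.3.4", now=t) for t in (0.0, 0.1, 0.2, 0.3)])
# [True, True, True, False]
```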
In the case they can’t, it’s only because modern captchas are often used to get cheap training data for making new AIs. In this case, the task may be outside of what current AI can do.
So when for example Google was making a book scanning AI to mass digitize printed books, no existing AI was good at turning hard to read text into words. The captchas used responses to train an AI that was actually good at this task, and as a side effect got a captcha that (at the time) was hard for existing AI to do. Some responses were checked against human-verified input as a test, but others were captchas with unknown answers used to build up the training data without paying an employee to do it.
The current picture captchas are used to train AIs for self-driving cars to detect common roadside obstacles and signage. Something they used to be bad at, but are now quite good at thanks to captcha response data.
Today, if you get anything other than a check box, it’s because you were randomly selected to be free labor to Google to help them develop a new AI tool.
They can’t, but the systems aren’t really tracking whether you selected the bridges anyway; they track things like mouse movements to check whether it’s human behavior or bot behavior.
Bots make clean moves, not humans.
You’re not actually proving that you’re a human, you’re providing training data for autonomous vehicles. Just like the earlier text captchas, where you provided training data for text digitization software.
Security is about deterrence, not prevention. That’s true in both the real world and online. A jewelry store won’t have an impenetrable safe. It’ll have a safe that takes a skilled safecracker around twice as long to crack as it takes the police to arrive. Likewise, a captcha isn’t meant to be impossible. It’s meant to be more expensive to make a computer beat it than it would cost to hire a team of people in the Philippines to click the boxes.
Also it’s looking more at how you move the mouse between boxes than which boxes you click on.
I think the more relevant ELI5 is: why does RECAPTCHA fail us? I’ve selected all the squares with the bus, it fails me. I’ve wasted hours of my life because of RECAPTCHA.
A lot of answers are a bit misguided. Just because something is possible doesn’t make it easy. For example, if you look on GitHub, you only find a couple of solutions for this problem. Anyone who’s not a monster hacker has very little chance of creating a bot that can solve captchas; however, they could find one on the internet that does.
Because they are often blurry 16-pixel images that no human can correctly identify 100% of the time either; captchas are tuned to the point where even humans can’t always pass them, which filters out most bots.
Most are also looking at reaction time and mouse movement, to block any bots that select images with unnatural speed, cursor movement, or lack of movement.
When you picked the photos containing “the bridge” you were training the AI to be able to recognize them. Now AI generally can do these things, and other types of captchas are in use and will be used to further train AI.
If you ask 100 bots to tell you whether it’s a bridge, and they all agree with what humans say, then the bots know it’s a bridge. If the bots disagree with each other, you ask a human and use the answer to train the bots.
The images you’re being shown are the ones the bots can’t agree on.
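That selection rule can be sketched as a toy disagreement score over model votes (1 = "bridge", 0 = "not a bridge"); the metric and the data here are invented purely for illustration:

```python
def disagreement(votes):
    """Fraction of minority votes among model labels for one image.

    0.0 means all models agree (no need to ask humans); values near 0.5
    mean a coin flip, i.e. the images worth showing to humans.
    """
    yes = sum(votes)
    return min(yes, len(votes) - yes) / len(votes)

images = {
    "img1": [1, 1, 1, 1],  # unanimous: clearly a bridge
    "img2": [1, 0, 1, 0],  # split vote: models can't agree
    "img3": [0, 0, 0, 1],  # mostly agreed: probably not a bridge
}

# show humans the images the models disagree on most
ranked = sorted(images, key=lambda k: disagreement(images[k]), reverse=True)
print(ranked)  # ['img2', 'img3', 'img1']
```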
They often can. This is why those puzzles have gotten more difficult. The images are really dark and fuzzy because the AI software used to trick them keeps getting more and more advanced. I imagine we will have to find another solution fairly soon, because I think we’re very close to the point where humans won’t be able to solve the puzzles better than the computers can.
They can. They’re better, faster, and more precise than we are.
What they fail at is being as slow and imprecise as we are.
Nowadays the tests don’t check whether you can solve a captcha, but HOW you do it. Mouse movement, click timing, precision, reaction times, errors, etc. are all measurements those websites look for. And bots are really bad at being “bad”.
The bots are getting better all the time, and as soon as they catch up, the image challenges are made a little bit harder, asking you to identify things that bots CAN’T yet.
But see, the thing is, the millions of people doing the reCAPTCHA are what’s training the bots.
Why do you think all the things we are asked to identify have to do with traffic? They’re things automated cars need to see. Crosswalks, bikes, motorcycles, traffic lights. We are training the bots, every time.
They are using us to train their AI, so when they say find X in pictures and you find them as a human, the AI learns.
This video made it easy for me to understand.