Human Vision
Human vision is fascinating. At the surface level, it seems simple. Light goes into your eyes, an image forms in your brain, and you can now see the world. But it is far from being that simple.
The first strange thing is that we don’t see the world upside down. Why would we? Because the image going through the lenses in our eyes gets flipped over. The image on our retinas is upside down. Somehow, when we learn to see, we mentally correct for this and see the world in the proper orientation.
Some people have even done experiments using special mirrored glasses that invert everything so that the image on our retinas is no longer upside down. When wearers first put on the glasses, everything looks upside down to them. It typically takes a week or two for the wearer to adjust, but after that, everything looks normal with the glasses on. Your brain isn’t simply accepting the orientation of the world it receives - your brain is adapting your vision based on matching perception and experience.
Not everyone’s brain adapts. A problem called reversal of vision metamorphopsia (RVM) can cause distortions in visual perception, including difficulties with orientation. In rare and extreme cases, it may even make the world appear upside down. This condition is mercifully rare. I tried to find articles written by people with this condition, but I couldn’t find any. There is no universal cure for this condition. I suppose they could move to Australia.
The process of taking visual information from your eyes and turning it into vision in your brain is complicated. Your primary visual representation of the world is built in a part of your brain called V1 or your Primary Visual Cortex. If this part of your brain is damaged, you’ll go blind even if your eyes are perfectly fine.
Your primary visual cortex (V1) doesn’t do all of the work of building your mental image by itself. There are other processors - V2, V3, V4, V5, and V6. V1 works with each of them to help make sense of the input signals from your eyes and build an image your brain can visualize.
V2 is your Secondary Visual Cortex. It detects edges and basic shapes. It helps you perceive textures. It compares edges between your eyes to build a 3D map. The process involves a lot of guessing, which is why there are so many great visual illusions that trick V2. People with a damaged V2 can see, but it is hard for them to make any sense of what they are seeing. They lose their visual sense of texture and they struggle to separate foreground objects from the background.
V3 has dorsal and ventral divisions. The dorsal pathway helps you determine where things are and how they move, while the ventral pathway helps you recognize what things are. I couldn’t find it in the scientific literature, but I think the dorsal pathway helps you distinguish between sharks and dolphins because it helps you see whether the dorsal fin is moving from side to side (sharks) or up and down (dolphins). Damage to V3 can make it difficult to process large-scale motion or perceive objects as a whole. For example, someone with V3 damage might see a door, a wall, windows, and a roof as separate pieces but struggle to perceive them as a complete house. With V3 damage, you literally lose the ability to see the forest for the trees.
V4 is your color center. It helps you mentally paint the objects you are seeing with the appropriate colors. As the lighting changes, a camera will capture those colors differently, but your V4 helps your mind keep track of their “true” colors. If your V4 is damaged, you won’t be able to process color. This isn’t the same as being red/green color blind (that’s another issue), but being completely unable to visualize color. Even more incredibly, if your V4 is damaged, you may even lose the ability to see color in your memories because V4 is used to mentally rebuild those memories in your brain!
V5 is your motion center. It detects movement and helps your brain estimate the speed and direction of moving objects. You might not think you’re great at math, but if you can catch a ball tossed to you, your brain is performing some incredibly complex calculations in real-time.
Your eyes track the ball’s changing position across successive moments. V5 processes this information, estimating its speed and trajectory. Another part of your brain accounts for acceleration due to gravity to predict where the ball will be. Then, your motor system calculates where your hand needs to be and when, sending precise instructions to dozens of muscles to move in perfect coordination. If you can catch a ball, don’t tell me you aren’t good at math.
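To give a feel for the math involved, here is a minimal Python sketch of that kind of prediction. It is not a model of what neurons actually do. It assumes simple projectile motion with no air resistance, and the function names (estimate_velocity, predict_catch) and all of the numbers are made up for illustration.

```python
import math

G = 9.81  # acceleration due to gravity in m/s^2 (assumed constant)


def estimate_velocity(p0, p1, dt):
    """Estimate the ball's velocity at the moment of the second sighting.

    p0 and p1 are (x, y) positions in metres seen dt seconds apart.
    The vertical term subtracts G*dt/2 because the ball kept slowing
    (or speeding up) under gravity between the two sightings.
    """
    vx = (p1[0] - p0[0]) / dt
    vy = (p1[1] - p0[1]) / dt - 0.5 * G * dt
    return vx, vy


def predict_catch(pos, vel, catch_height):
    """Return (seconds_from_now, x) where the ball comes down to catch_height.

    Returns None if the ball never reaches that height from here on.
    Hypothetical helper: ignores air resistance and spin.
    """
    x, y = pos
    vx, vy = vel
    # Solve y + vy*t - 0.5*G*t^2 = catch_height and keep the later root.
    discriminant = vy * vy + 2.0 * G * (y - catch_height)
    if discriminant < 0:
        return None
    t = (vy + math.sqrt(discriminant)) / G
    if t < 0:
        return None
    return t, x + vx * t
```

For example, predict_catch((0.0, 1.5), (5.0, 5.0), 1.5) describes a ball tossed from hand height with 5 m/s of forward speed and 5 m/s of upward speed. It comes back down to hand height roughly one second later and roughly five metres away. Your brain gets to a similar answer without a single equation.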
I’m always impressed by how effortlessly people perform incredible yet commonplace feats like catching a ball. But without V5, this wouldn’t be possible. If it’s damaged, you’ll develop motion blindness (akinetopsia), where moving objects appear as a series of frozen images rather than smooth motion.
V6 plays a crucial role in processing large-scale motion, particularly how the world moves around you as you move through it. It helps your brain interpret optic flow - the way objects shift in your field of view as you walk, run, or turn your head. Without V6, motion perception becomes disorienting. If you’ve ever been on a stationary train and felt like you were moving when the train next to you pulled away, that’s your V6 getting confused. People with V6 damage often experience severe motion-related disorientation, making even simple tasks like walking through a room feel chaotic and unstable.
Interestingly, if V1 is damaged, a person may not consciously “see”, but they might be able to visually sense things in their world. It’s called blindsight. There have been documented cases of people who are functionally blind yet can still catch objects tossed toward them. Blindsight also allows some individuals to detect light, object locations, and basic shapes without consciously perceiving them. At least, that’s what I’ve read. I’m having a hard time picturing it.
Scientists are working on cortical visual prostheses. These are implants designed to restore some vision by stimulating the visual cortex directly, bypassing the eyes. Hopefully, we aren’t too far from being able to bring vision back to stroke and accident victims who have lost aspects of their vision. And maybe we can go beyond that and have upgraded processing units so that we can have supervision. Or at least fix it so that my brain isn’t so easily confused.
I remember driving home from College Station many decades ago. It was late at night and I was a long way behind what I thought was a truck. I’d been following this “truck” for about 10 or 15 minutes when the most unbelievable thing happened. It shot forward at Star Trek-like warp speeds. And then my brain finally figured out that it wasn’t a truck. It was two motorcycles riding in parallel and I mistook their tail lights for a pair of truck tail lights. When the motorcycles switched from riding side by side to riding front and back, the convergence of their tail lights looked for all the world like the truck was shooting forward at an incredible speed.
Earlier, I mentioned that your memories will lose color if V4 is damaged. That’s because your brain doesn’t store the “images” you see. It stores a description of what you see. When you remember an image, it pulls up that description and generates a new image. This process makes heavy use of the higher-order parts of your vision processing system like V4, V5, and V6 to rebuild that image. If those systems are damaged, your ability to make mental images can also be damaged.
Have you ever heard of the Invisible Gorilla Experiment? In it, people were asked to watch a video and count the number of times the players in white passed a basketball. While that is happening, a person in a gorilla suit walks into the scene, pounds his chest, and then exits. Amazingly, about half of the people in the experiment were so focused on counting the passes that they never noticed the gorilla. If you asked them to replay the scene in their mind and gave them hours and hours to do so, they would never see the gorilla because their brains didn’t record it in the first place. We don’t record memories like a video camera records videos. Our brains build an image from a lot of sensory inputs, and it is the brain’s construction that we see and remember. If your brain didn’t think something was significant at the time, it probably didn’t store it, and so you’ll never be able to remember it.
Even when your eyes and your V1 through V6 are all working perfectly, it doesn’t mean that you can see normally. Other parts of the brain are involved in using that visual information. For example, some people suffer from face blindness (prosopagnosia). They can see every feature of a face - the lips, the eyes, the nose, all of it. They can read emotions - happy smiles, confused looks, angry expressions. What they can’t do is recognize a face. In extreme cases, they literally cannot recognize the faces of their family members. If their spouse leaves to go get a haircut and a new outfit, the face-blind person will not recognize them when they come home. The part of the brain that associates the appearance of a face with who that person is doesn’t work. They have to learn to use other cues like clothing, gait, height, hairstyles, and things like that to make a good guess.
Other people suffer from visual object agnosia - they can see objects but they can’t recognize them. They might see a coffee cup and not be able to see it as a coffee cup. It’s just an object until they hold it and smell the coffee. The book The Man Who Mistook His Wife for a Hat is about just such a case.
Another weird problem is Pure Alexia. That’s a person who cannot recognize written words. They might be able to speak normally. They might even be able to write or type, but they can’t even read the words that they’ve just written. Even weirder, for multilingual people, it can sometimes affect only one of their languages and not the other.
When we read, we don’t quickly process all the letters and sound out the word in our minds. We recognize the entire word as a token. When we see “cat”, we immediately know that it means a cat. We don’t process words in a letter-by-letter fashion. We may start that way, but once we’ve been betrayed by words like “two” or “tongue”, we drop that strategy and learn to read based on the word as a whole. We only fall back on phonetics as a last resort for unfamiliar words. People with Pure Alexia have lost the ability to look at words as a whole. They have to work out each word as though they are reading it for the first time, and it is even harder for them because they can’t file it away in their minds.
Simultanagnosia is the inability to see more than one object at a time. The person’s brain only lets them attend to a single object, even when several are in view. I have this problem when someone puts a good Neapolitan pizza in front of me.
Another agnosia is topographical agnosia, which is difficulty recognizing a familiar place or route. Sufferers can still see buildings, roads, and landmarks, but they can’t interpret them spatially. They literally need verbal directions to get them home from other parts of their own neighborhood. They are like Gen Z drivers trying to get somewhere without the aid of a GPS.
Even when your eyes and your vision system are working perfectly, if parts of your brain aren’t, you can’t necessarily see things that other people can see. Technically, you can see them, but they don’t make sense the way that they do to most people. Vision is a strangely complex thing.
Another weird visual phenomenon is called synesthesia. That’s when your visual sense gets crossed up with other senses. Some people can smell certain colors. I’m not talking about scratch-and-sniff stuff. An example is someone who catches a whiff of lemon scent whenever they see a bright yellow color. That’s called olfactory-color synesthesia. Others report that different colors evoke a sense of taste. For example, the color red might make them sense a metallic flavor. This is called color-gustatory synesthesia. Other people sense colors based on musical notes. When they hear a certain sound or note, they see blue or red or some other color. That’s called chromesthesia. In other cases, letters or digits lead to the sensation of a color. That’s grapheme-color synesthesia. And these sensory mixups don’t necessarily involve a visual sense. With lexical-gustatory synesthesia, some words or names elicit their own taste or flavor.
Here’s another example of how weird our vision processing is. If someone’s V1 is heavily damaged, they can’t consciously see but they might, as I mentioned above, still be able to sense motion and catch a ball. The really strange part is that they can still emotionally recognize faces. If a threatening or feared face appears before them, they may be able to do enough visual processing to alert their amygdala and they will react with fear. They won’t know what is causing the sensation, but they’ll feel it. On the positive side, if a dearly loved person appears before them, they might get a warm or calming feeling. Again, they won’t consciously recognize the person or even know why they are getting the feeling, but it can happen.
With all of the complexity of the human vision system, what happens if a person is born blind and later has their vision restored? Sadly, they will likely never see as a sighted person does. The ability of the brain to adapt to new processes atrophies over time. It appears that the window for learning how to see is only a few years long. After that, if you haven’t learned, you probably never will. The stimulus will be there, but your brain won’t be able to make any sense of it.
I mentioned earlier that red/green color blindness wasn’t a problem with V4. Human retinas typically have three types of color sensors (cones) - one each for red, green, and blue. People with red/green color blindness are either missing one of those types of sensors or have red and green sensors whose sensitivities overlap too much to differentiate the colors well. But just as there are people like that, there are also people with tetrachromacy. They have four different types of cones. Some people theorize that this gives them more accurate color vision, but I couldn’t find any studies that confirmed it.
Women are more likely to have tetrachromacy and men are more likely to be red/green color blind. I believe that is because the associated genes are on the X chromosome. Because women get two Xs and men get an XY, that gives women more chances to get good cone cells.
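Here is a rough back-of-the-envelope version of that argument in Python. It is only a sketch under textbook assumptions: a single recessive gene variant on the X chromosome and random mating. The function name and the 8% figure used in the example are illustrative assumptions, not numbers from a study I am citing.

```python
def x_linked_recessive_rates(q):
    """Expected rates for a recessive X-linked trait.

    q is the assumed fraction of X chromosomes carrying the variant.
    Men have one X, so one bad copy is enough.
    Women have two Xs, so both copies must carry it.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    return {
        "men_affected": q,
        "women_affected": q * q,
        # Women with exactly one variant copy: unaffected carriers.
        "women_carriers": 2.0 * q * (1.0 - q),
    }
```

If 8% of X chromosomes carried the variant (q = 0.08), this simple model would predict 8% of men affected but only 0.64% of women, which is the "second chance" that the extra X provides.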
By the way, there is no way of knowing for sure whether your mental image of green is the same as my green and that my red is the same as your red. Our brains may build the images in opposite ways. I can’t see into your head to see how your brain is visualizing what it sees and you can’t see into mine. As long as we each consistently see red as our red and green as our green, it doesn’t really matter, but I’ve always thought it was strange that our internal mental representations of colors could be wildly different. Maybe that could explain why some people make such terrible fashion choices.
Some studies have shown that our language impacts our ability to perceive color. For example, English has two basic words for red - red and pink (for light red) - but only one for blue. Many languages, like Italian and Russian, have two different words for dark blue and light blue. A study of Russian speakers showed that they could distinguish more quickly between dark and light blues than native English speakers. Even more interesting is that this advantage disappeared when both groups were given tasks that also used their language processing. Somehow language processing is related to our ability to make sense out of what we see.
Your brain makes up a lot of what you see. One reason is that you have a blind spot in each eye. It’s where the optic nerve leaves your eye on its way back to your brain. But you don’t have holes in your vision. Your brain infers what should be there and automatically fills it in for you. The same thing happens as your eyes dart around (saccades). Rather than seeing a blur as your eyes move, your brain suppresses the motion and fills in what it expects to be there, creating a seamless visual experience.
I have a relatively unusual and nearly useless ability called Voluntary Nystagmus. I can very rapidly and rhythmically move my eyes from side to side. That is, I can wiggle them for a few seconds, making everything look blurry and strobe-like. The only use I’ve found for it is that I can visually slow or stop spinning objects. So I can walk into a room, look at a spinning ceiling fan, shake my eyes, and tell you how many blades are on that fan. As far as superpowers go, it has to be one of the lamest imaginable, but it’s what I’ve got. It appears to be hereditary and both of my sons can do it.
OK, my shaky eye superpower may be lame, but at least I can recognize faces. To be fair, I can’t remember the names associated with them, but I can recognize them. And I can read words, catch thrown objects, and my world doesn’t appear upside down. Vision seems simple, but it is anything but simple. Just ask people trying to design self-driving cars. The vast majority of people in the world are walking around with an absolutely amazing vision processing system that they totally take for granted until it stops working. Hopefully, now you have a better sense of just how amazing yours is.
Maybe I should write an article on how cameras see things - Bayer filter mosaics, CMOS vs CCD, ADCs, debayering, white balance, gamma correction, sharpening, and all that stuff. It’s very different from how your eyes work. I could extend it into things like how facial recognition software works and how to auto-keyword your images. We’ll see.