Facebook's AI that opens closed eyes in photos: how it works

The query “open eyes in photo” travels a familiar consumer path. Someone reviews a recent group photo, finds that one person blinked at the wrong moment, and starts searching for a way to fix it. Photo editing apps offer manual touch-ups. Newer apps offer one-click corrections. Behind several of those one-click features sits a research lineage that started at Facebook in 2018, when a team there demonstrated an AI system capable of opening closed eyes in still photographs convincingly enough that the result was rarely detectable to viewers. The technique that powered that demonstration has since spread across the consumer photo editing category, into smartphone camera pipelines, and into adjacent generative image tools. The story of how it works, and where it has landed eight years later, is more revealing about the trajectory of consumer AI than the original feature suggests.

What the Facebook research actually demonstrated

The 2018 paper from Facebook AI Research, titled “Eye In-Painting with Exemplar Generative Adversarial Networks,” set out a deceptively simple problem: given a photograph in which a subject’s eyes are closed, generate a plausible version with the eyes open, while keeping the rest of the face and image unchanged. The technical novelty was the use of what the researchers called an exemplar GAN. Rather than generating new eye content from a generic distribution, the system referenced other photographs of the same person, when available, to produce eyes consistent with the subject’s actual appearance.

The architectural choice mattered because the alternatives were demonstrably worse. Generic in-painting tended to produce eyes that were technically open but visibly synthetic, often subtly misaligned with the rest of the face. Reference-driven in-painting, by contrast, could reproduce the actual eye color, shape, and squint pattern of the specific person being photographed. The output was, in the team’s own evaluations and in independent user studies, indistinguishable from the original photograph for the majority of viewers.
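The constraint described above, changing only the eye region while leaving the rest of the image untouched, can be sketched as a masked composite. This is a minimal illustration, not the paper's architecture: the `composite` function, the tiny grayscale grids, and the idea that `generated` comes from a reference-conditioned model are all assumptions made for the example.

```python
from typing import List

# Hypothetical representation: a grayscale image as rows of floats in [0, 1].
Image = List[List[float]]


def composite(original: Image, generated: Image, mask: Image) -> Image:
    """Blend generated pixels into the original only where the mask is set.

    mask value 1.0 -> take the generated pixel (the eye region)
    mask value 0.0 -> keep the original pixel (everything else)
    """
    out = []
    for row_o, row_g, row_m in zip(original, generated, mask):
        out.append([m * g + (1.0 - m) * o
                    for o, g, m in zip(row_o, row_g, row_m)])
    return out


# Toy 2x3 "photo": only the centre pixel of the second row is the eye region.
original = [[0.2, 0.2, 0.2], [0.2, 0.1, 0.2]]
generated = [[0.9, 0.9, 0.9], [0.9, 0.8, 0.9]]  # assumed model output
mask = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

result = composite(original, generated, mask)
```

Whatever the generator produces outside the mask is discarded, which is why the quality of the result depends almost entirely on how well the generated eye region matches the specific person.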

The training pipeline involved millions of facial images, with a discriminator network learning to distinguish real open-eye photographs from generated ones, while the generator learned to produce outputs that the discriminator could not tell apart. This is the standard adversarial setup that has defined the GAN family of techniques since Ian Goodfellow’s 2014 paper, and it remains the foundation of consumer face manipulation tools in 2026, even as diffusion models have absorbed adjacent generative tasks. The patterns connect to what we documented in our diffusion models 2025 analysis and across our AI image generation coverage.
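The adversarial setup can be made concrete with the two loss functions involved. The sketch below uses plain Python and assumes the discriminator outputs a probability strictly between 0 and 1 that an image is real; the function names are illustrative, and the generator loss shown is the common non-saturating variant rather than anything confirmed as the Facebook team's exact training objective.

```python
import math
from typing import Sequence


def discriminator_loss(real_probs: Sequence[float],
                       fake_probs: Sequence[float]) -> float:
    """Binary cross-entropy for the discriminator.

    Low when real images score near 1 and generated images score near 0.
    Inputs must lie strictly between 0 and 1.
    """
    real_term = -sum(math.log(p) for p in real_probs) / len(real_probs)
    fake_term = -sum(math.log(1.0 - p) for p in fake_probs) / len(fake_probs)
    return real_term + fake_term


def generator_loss(fake_probs: Sequence[float]) -> float:
    """Non-saturating generator loss.

    Low when the discriminator scores generated images near 1,
    meaning the generator is fooling it.
    """
    return -sum(math.log(p) for p in fake_probs) / len(fake_probs)


real_scores = [0.9, 0.8]       # discriminator output on real open-eye photos
caught_fakes = [0.1, 0.2]      # generated images the discriminator rejects
convincing_fakes = [0.9, 0.8]  # generated images the discriminator accepts

d_when_winning = discriminator_loss(real_scores, caught_fakes)
d_when_fooled = discriminator_loss(real_scores, convincing_fakes)
g_when_caught = generator_loss(caught_fakes)
g_when_fooling = generator_loss(convincing_fakes)
```

The two losses pull in opposite directions: the same batch of convincing fakes that lowers the generator's loss raises the discriminator's, and training alternates between the two.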

Where the technique has spread

The eye-opening feature itself never shipped as a standalone Facebook product. The research was published, the technique was understood, and the consumer-facing implementations were built by other teams across the industry. Google’s Photos app introduced its own version under the name Best Take, which combines several burst-shot photographs into a single output where each subject’s face is selected from the moment they looked best. Apple’s similar feature lets users edit closed eyes in Live Photos using alternative frames from the same capture. Adobe’s Photoshop and Lightroom integrated GAN-based and later diffusion-based facial in-painting tools that perform the same operation under more generic framing.
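The burst-based approach differs from in-painting in that it selects real pixels rather than generating new ones. A rough sketch of the selection step follows. It is not Google's or Apple's implementation: the per-frame scores are assumed to come from some face-quality model (for example, an eye-openness estimate), and `pick_best_frames` is a hypothetical name.

```python
from typing import Dict, List


def pick_best_frames(scores: Dict[str, List[float]]) -> Dict[str, int]:
    """For each subject, return the index of the burst frame with the top score.

    scores maps a subject label to one score per frame in the burst.
    Ties resolve to the earliest frame.
    """
    return {
        subject: max(range(len(frame_scores)), key=frame_scores.__getitem__)
        for subject, frame_scores in scores.items()
    }


# Three frames from one burst; scores are assumed outputs of a quality model.
burst_scores = {
    "alice": [0.1, 0.9, 0.4],  # blinked in frame 0
    "bob": [0.8, 0.2, 0.7],    # blinked in frame 1
}

best = pick_best_frames(burst_scores)
# best == {"alice": 1, "bob": 0}
```

A real pipeline would then align and blend each chosen face into a single base frame, which is where the generative components come back in.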

The shift from “a Facebook research feature” to “a default expectation in any premium photo app” took roughly five years. The shift from “a default expectation in any premium photo app” to “a behavior built into smartphone camera pipelines that fires before the photo is even saved to the camera roll” is happening now. Pixel phones, iPhones, and several Android flagships are using on-device neural networks to make this kind of adjustment automatically, often without user awareness.

The proliferation is a useful illustration of a pattern that has played out across generative imaging. The first published research describes a constrained problem and a technical solution. The consumer integration follows within two to four years. The default behavior, embedded in operating systems and hardware pipelines, follows within five to seven years. By the time the technology is invisible, most users have stopped questioning what is happening to their photos.

The deeper question the feature raises

The functional capability is straightforward to describe. The implications are harder to fully unpack. A photograph that has been silently modified by an on-device AI to open a subject’s eyes is no longer a documentary record of what the camera saw. It is a constructed image, derived from the camera’s input and from the model’s training data, with a fidelity to reality that depends on choices made by the model designer rather than on the optical physics of the camera itself.

For consumer photography, the implications are small. A blinked group photo that is silently corrected to look right is, for nearly everyone, an improvement. For legal, journalistic, and forensic photography, the implications are substantial. The same technology that fixes a blink can fix an unflattering grimace, a moment of distraction, or a fleeting expression of doubt, and the camera that produced the photograph offers no record of what the original moment actually contained. The patterns developing here are documented in our deepfake detection coverage.

The industry response has been to develop content provenance standards, including C2PA and adjacent specifications, that allow images to carry metadata describing the chain of modifications applied to them. The standards are real and increasingly supported by major camera manufacturers and software platforms. Their adoption in consumer products has been uneven. Most users will not check the provenance metadata of a photograph before sharing it, and most platforms do not surface that metadata in any visible way.
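The idea of an image carrying a record of its own modifications can be illustrated with a hash-chained edit log. This is a toy sketch only: it is not the C2PA manifest format, it omits the cryptographic signatures that real provenance standards rely on, and every field name here is an assumption made for the example.

```python
import hashlib
import json
from typing import Dict, List

Entry = Dict[str, str]


def _digest(body: Entry) -> str:
    """SHA-256 over a canonical JSON encoding of an entry body."""
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_edit(log: List[Entry], action: str, tool: str) -> List[Entry]:
    """Return a new log with one more entry linked to the previous one."""
    prev_hash = log[-1]["hash"] if log else ""
    entry = {"action": action, "tool": tool, "prev": prev_hash}
    entry["hash"] = _digest(entry)  # hash covers action, tool, and prev
    return log + [entry]


def verify(log: List[Entry]) -> bool:
    """Check that every entry is intact and correctly linked."""
    prev_hash = ""
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if body["prev"] != prev_hash or _digest(body) != entry["hash"]:
            return False
        prev_hash = entry["hash"]
    return True


log = append_edit([], "capture", "camera sensor")
log = append_edit(log, "open_eyes", "on-device model")

# Rewriting history after the fact breaks the chain.
tampered = [dict(e) for e in log]
tampered[1]["action"] = "none"

log_ok = verify(log)
tampered_ok = verify(tampered)
```

The log only helps if something downstream actually reads it, which is the adoption gap described above.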

Where the consumer market is heading

The integration of generative AI into the consumer photo pipeline is not slowing. The capabilities that have arrived in the past 18 months, including object removal, background replacement, generative expansion of cropped photos, and reference-driven facial editing, go well beyond eye-opening. The reference-driven approach of Facebook’s 2018 research carries over into the diffusion-based tools that ship in Adobe Firefly, Google’s Magic Editor, Apple’s Image Playground, and the open-source community’s growing list of ControlNet and IP-Adapter variants, even though the underlying architecture has changed from GANs to diffusion models.

For consumers, the practical effect is that the line between photo editing and photo generation is dissolving. Cropping a friend out of a photograph used to leave a visible gap. The same operation in 2026 fills the gap with a synthesized background. Changing the lighting of a portrait used to require careful retouching. The same operation now runs as a one-click adjustment. The output looks like a photograph because the underlying model has learned what photographs look like, not because anything photographic was captured for the modified region.

For platforms, the moderation problem has shifted from detecting obviously manipulated images to detecting whether the manipulation crossed thresholds that matter for the specific context. A wedding photographer’s tasteful retouching is welcome. A political campaign’s selective brightening of a candidate’s face is corrosive. The same underlying technique sits behind both. The patterns surfacing here connect with our AI governance hidden risks analysis and the broader regulatory conversation tracked across our AI regulation in the EU coverage.

A reorientation for how we think about photographs

The reorientation worth naming is that the photograph, as a category of visual evidence, has been quietly redefined. The image that emerges from a 2026 smartphone is not a recording of what the camera saw. It is the output of a pipeline that combined what the camera saw with what the model knew about what the scene probably looked like. The two are difficult to separate, and the consumer interface gives no clue that they should be separated.

For journalists, courts, insurance adjusters, and anyone whose work depends on photographic evidence, the implication is that the chain of custody for digital images now needs to include the model layer. A photograph submitted as evidence has to be evaluated not just for whether it was manipulated after capture, but for what the device that produced it was doing at the moment of capture. The discipline required to answer that question is not yet standard across the industries that depend on photographic evidence.

For everyone else, the implication is smaller and more personal. The photographs that document our lives are not quite what we believe them to be, and they have not been for several years. The Facebook research that opened closed eyes was an early signal of a transition that is now largely complete in consumer devices. The question is what we choose to do with the awareness that the transition has occurred.

The question for anyone editing their next photo

The eye-opening feature was the friendly version of a transformation that has since absorbed most of the consumer photography stack. The transformation will continue. The capabilities will improve. The line between captured and generated will keep dissolving. The question is not whether to adopt these tools. They are largely already adopted, often without explicit choice.

So one question worth carrying into the next photo edit: when you fix the blink, smooth the expression, brighten the lighting, or remove the unwanted background figure, what part of the resulting image is still a record of what actually happened, and how comfortable are you with the answer?
