Change the Way We Think About Visuals, Images and Video

Written by Minson Chen | Mar 24, 2024 10:00:00 PM

This article was originally published on Medium

Generative AI is challenging many of our assumptions about the visual world. It’s an exciting time with huge potential for creativity to explode, but it also comes with risks. We need to start the conversation about where this technology will take us.

For over 150 years we’ve been using cameras to capture moments in time. Photography has given us the gift of visual reminders of treasured memories while also allowing us to tell a story, convey a message or share ideas visually.

The old cliche says that a photograph is a moment frozen in time. There’s something comforting about that; the idea that you can capture an event, an emotion, a memory — and hold onto it forever.

But this cliche no longer represents reality. Technology moves fast and principles that were true for 150 years can suddenly be turned on their head. While we want to hang on to accurate representations of our special moments, there are times when we may want to change a photograph to better tell a story.

Today, advanced AI gives us the power to modify any image or video according to our creative needs. We can even generate the entire image or video from scratch, with no camera needed, and get a super realistic result.

whichfaceisreal.com challenges us to guess which of two photos is a synthetic person generated by AI.

How? With generative AI. This new evolution of AI is revolutionizing how we need to think about data in general and visuals in particular. It’s a powerful technology that comes with incredible potential to improve our lives. And, as with any paradigm change, it also raises questions and challenges.

We need to start having a conversation about how synthetic media will impact our lives now, while we’re still at the beginning of the journey, so we can make sure that we meet all the challenges head-on and maximize its potential for good, while mitigating the risk for abuse.

First things first — what is generative AI?

Until now, AI has only been able to say something about data that already exists. It can give you a recommendation on Netflix, predict when you might get stuck in traffic or recognize an object or person in a picture. But that has been the limit of what it can do.

Generative AI goes several steps further. It’s able to create new data from scratch, aka synthetic data, that’s of the same quality as something created by humans. This can be done with a variety of media — text and speech, for example. At Bria, we’re using this technology to generate high-quality images and video.

A photo is no longer frozen in time — Bria’s generative AI can rapidly adapt visuals for new audiences by creating new facial expressions, an entirely new model and new scenery.

And so, an image is no longer a moment frozen in time

Already at Bria we’ve harnessed the power of generative AI to empower our users to perform many modifications to existing visuals. They can change the models in a photograph — their expressions, age, appearance. Users can even replace the model with a generated person that resonates with their target audience, and then bring them to life in a realistic video.

Bria’s platform empowers users to communicate visually with no need for a camera or Photoshop.

One of my favorite scenes in Mary Poppins is when she leads the children to jump inside Bert’s street sketch. Now you can do that with AI, taking any 2D photograph and generating a realistic 3D world that allows you to figuratively jump inside, although without the singing penguins (so far).

Bria lets you transport yourself into a still image, turning it into a video in seconds.

You can also add or remove objects — historically this was always a challenge because of the holes you leave behind in the picture. But generative AI can fill in those holes to make it look like nothing is missing.

We also enable people to generate visuals that match their brand guidelines: mood, coloring and logo. You can set these values once and apply them instantly across any number of images.

This is just the beginning. We’re getting closer and closer to the day when someone will just ask a platform to generate a visual and the AI will deliver a perfect result instantly. With no camera, no Photoshop needed. From thoughts to visuals in a few seconds. It sounds like science fiction, but it’s already here.

The gift of creative independence

The potential of this technological evolution is truly incredible. Storytelling is one of the most innate human habits — we have evidence of it going back millennia. Today, professional visual storytelling is restricted to a small group of people who have the skills to realize a vision — either in traditional art forms like painting, sculpture and the like; or digitally with software like Photoshop.

If you’re planning a traditional photo or video shoot (in the movie or music industries, or for creating marketing visuals, for example), you need money and time not just for the filming or photography, but also for editing and post-production. And did anyone mention the endless feedback loops?

Generative AI changes that. It’ll empower everyone with creative independence — there’ll no longer be a need to practice for hours to refine your artistic skills — all you’ll need is a vision. Then input that vision into the system, and the AI will generate it for you. Rosebud AI’s technology can already generate realistic scenery, for example.

Examples of synthetic scenery generated by Rosebud AI.

This is nothing short of revolutionary. Today, the people at the top of the creative professions have a combination of a creative mind and the skillset to realize their vision. So there’s a massive untapped resource of creative people who can’t succeed because they lack the executional skills or budget to do so.

But in the future, there’ll no longer be a barrier to creativity. The most creative minds will be at the top of the creative professions, because all they will need is a powerful imagination, and the machines will realize their vision. For example, Hour One can take text and turn it into a talking presenter.

AI-generated presenters developed by Hour One.

We need to be conscious of the risks and mitigate accordingly

There are other consequences of an image no longer representing a moment in time, and we and everyone in our industry have to take them into consideration. It means that we can’t necessarily rely on an image as a reliable witness. It will no longer necessarily be accurate to say “seeing is believing”.

Of course, people have been modifying images for centuries to make them fit their narrative. But as it becomes easier and more accessible to do so, we need to adapt ourselves to a world where we can’t necessarily believe our eyes.

If anyone can remove objects from an image in an instant, can we still “believe our eyes”?

Companies developing this technology need to consider how to put safeguards in place to prevent its abuse, particularly as its productization will lead to the democratization of generative AI. We have a responsibility to lead the discussion about how it can be used without compromising ethics and values.

There are exciting times ahead

One thing’s for sure: this is an exciting time for the world of creativity and, arguably, for humanity as a whole. I wonder if people realize how close we are to creating a metaverse that actually looks real, one you can experience just like the holodeck from Star Trek. I look forward to sharing more thoughts about it in the weeks, months and years to come.

I’d love to know what you think about the potential of synthetic media to unleash a new world of creativity. Please share your thoughts in the comments.

Yair Adato is the Co-founder and CEO of Bria.
