Why Bria Stands With Data Owners and Creators For Visual Generative AI
Dr. Efrat Taig · Oct 3, 2024
In the fast-paced world of Artificial Intelligence, breakthrough technologies often struggle to bridge the gap between academic research and practical, commercial applications. As we've seen in our exploration of background replacement using diffusion models, the journey from concept to product is rarely straightforward. And what happens after we solve the technical challenges? How do we ensure cutting-edge AI technologies not only work but also respect ethical considerations and copyright law? And how can businesses leverage these advancements while fairly compensating the creators and data providers who make it all possible?
This blog will explore these essential questions, examining the intersection of AI technology, industry needs, and ethical considerations.
As the VP of Generative AI Technology at Bria, I've had a front-row seat to these challenges, and I'm excited to share my insights on how we're working to create a more balanced and fair AI ecosystem. We'll explore the multifaceted world of AI commercialization, from the importance of open source in driving innovation to developing new business models that respect creator rights.
Whether you're a developer, a business leader, or simply curious about the future of AI, this discussion will illuminate the often-overlooked aspects of bringing AI from the lab to the real world.
Join me as we explore how companies like Bria are reimagining the AI industry, striving to create a future where technological advancement and ethical considerations coexist.
Let's start with the foundation of any good model: data. At Bria, we've learned that quality always trumps quantity when it comes to data.
Here's what we've discovered.
The Two-Phase Training Strategy
We've found that splitting our training into two phases yields impressive results.
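The post doesn't spell out the two phases at this point, so take this as a minimal sketch of one common split, my assumption rather than Bria's published recipe: broad pretraining on the full dataset, then fine-tuning on a smaller, curated, high-quality subset at a lower learning rate. The model and loader names are hypothetical placeholders.

```python
import torch

def train_phase(model, loader, lr, epochs):
    # One phase: a plain AdamW loop over the given dataset.
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for batch in loader:
            loss = model(**batch).loss  # assumes the model returns its own loss
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()

# Hypothetical usage: broad pretraining, then curated fine-tuning.
#   train_phase(model, full_loader, lr=1e-4, epochs=3)     # phase 1: full dataset
#   train_phase(model, curated_loader, lr=1e-5, epochs=1)  # phase 2: curated subset
```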
The Foreground Preservation Trick
Here's a pro tip: if maintaining the integrity of the original image is crucial (and let's face it, when isn't it?), try this. After training, restore the original foreground. Models sometimes overlook masks, leading to unwanted changes. Our solution? Run a post-processing step to blend the original foreground into the generated image. It's like having your cake and eating it too!
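Here's a minimal sketch of that blending step, assuming an inpainting-style mask that is white over the regenerated background and black over the foreground to preserve (the file names are placeholders):

```python
import numpy as np
from PIL import Image, ImageFilter

# Hypothetical file names for the original image, the model output, and the mask.
original = np.asarray(Image.open("original.png").convert("RGB"), dtype=np.float32)
generated = np.asarray(Image.open("generated.png").convert("RGB"), dtype=np.float32)

# Feather the mask edge slightly so the seam between the restored foreground
# and the new background doesn't look cut out.
mask = Image.open("mask.png").convert("L").filter(ImageFilter.GaussianBlur(radius=3))
alpha = np.asarray(mask, dtype=np.float32)[..., None] / 255.0

# Keep the generated pixels where the mask is white; restore the original
# foreground where it is black.
blended = alpha * generated + (1.0 - alpha) * original
Image.fromarray(blended.astype(np.uint8)).save("blended.png")
```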
Learning Rate: The Goldilocks Zone
When optimizing your learning rate, breaking the process into phases is useful. Start with a linear warm-up, gradually increasing the learning rate to help the model adjust to the data without making drastic updates. After the warm-up, you can switch to a constant or cyclical learning rate for more stable training. This approach helps prevent overfitting and allows the model to converge more efficiently.
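As a concrete sketch of that schedule, here's how the warm-up-then-constant variant looks with PyTorch's built-in LambdaLR. The step count is an assumption, and the model, loader, and loss helper in the usage comment are placeholders:

```python
import torch

def warmup_then_constant(optimizer, warmup_steps=1_000):
    # Multiplier on the base lr: ramps linearly from ~0 to 1, then holds at 1.
    return torch.optim.lr_scheduler.LambdaLR(
        optimizer, lambda step: min(1.0, (step + 1) / warmup_steps)
    )

# Usage inside a training loop (model / loader / compute_loss are placeholders):
#   optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
#   scheduler = warmup_then_constant(optimizer)
#   for batch in loader:
#       loss = compute_loss(model, batch)
#       loss.backward()
#       optimizer.step()
#       scheduler.step()    # advance the warm-up once per optimizer step
#       optimizer.zero_grad()
```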
I trained Bria's 2.3 inpainting network using this concept, and here's the result for my profile picture. It's getting close to what I want. The coolest part? I got excellent results fast: rather than one good image out of ten, the first four attempts were already decent.
Does it look believable? Was the photo taken in a library? Not entirely, but it’s heading in that direction.
Looking Ahead: ControlNets and Advanced Adjustments
We'll explore more complex topics in upcoming blogs, including ControlNet Deep Dive—A Different POV and How to Modify ControlNet Input: A Step-by-Step Guide for Input Modification. These posts will cover how to make better adjustments for specific use cases, keep up with the state of the art (SOTA), and tailor the right methodology to your needs.
In today's fast-paced world, keeping up with SOTA and figuring out which approach to use is challenging. When you're aiming for high-performing, natural-looking products, every project requires custom adjustments and the right tools.
The Bria.ai Difference: Bridging Academia and Industry
From where I stand, bridging academia's groundbreaking theories and the practical needs of industry is the heart of the challenge for us: the algorithm developers and engineers responsible for turning cutting-edge technology into genuine, functional products. But let's be honest: how often do things work "off the shelf"? Rarely.
And that's exactly why open source is so valuable. The ability to download model weights, tweak the technology, and mold it to our specific use cases is critical. We can retrain on customized data, add a loss function that makes more sense for our goals, and fine-tune to perfection. The numbers speak for themselves: look at the millions of downloads for open models like those from Stability AI, or the runaway success of Hugging Face. There's a natural, pressing need for this flexibility, and we all feel it.
But here's the catch: the moment commercial companies develop these models, we hit a wall. They have poured millions, sometimes billions, into development, and understandably they need to protect their investments. Their agenda isn't open source; it's driving revenue from their products. And while that's perfectly fair, it leaves us, the developers and researchers, out in the cold, unable to access, replicate, or build on this remarkable work.
In recent years, I have observed a notable shift at every conference I attend. I encounter remarkable research, yet without open access to the code, it feels akin to being given the keys to a treasure chest only to find it securely locked. The balance of power is transitioning from academia's open, collaborative spirit to the restricted domains of large corporations and their proprietary models. This shift poses a significant challenge for those who thrive on innovation and collaboration.
To be clear, I’m not an advocate for open source at all costs. I recognize the importance of protecting intellectual property and the business models of these companies. However, it’s essential to acknowledge the challenges this creates for the broader development community. There’s still hope for bridging this gap, and in the next section, I’ll introduce a solution we’ve been developing at Bria to address these challenges.
This is precisely the challenge that Bria is working to solve. We're trying to create a new market that bridges the gap between these two worlds, something we call "Source Code Available." On one hand, we're a business, a startup that needs to grow, innovate, and sell its products. (Fun fact: a company that doesn't sell doesn't have much of a future! It's one of those hard lessons you learn after leaving the university halls.) And that's precisely what we do: we sell our models directly to other companies (B2B, business-to-business), allowing them to integrate Bria's models into their products or services. We provide foundational and auxiliary models that can be customized to fit their specific needs and incorporated into the solutions they develop.
We've developed models compatible with open-source frameworks (like SDXL and others), ensuring that advancements built on those frameworks remain compatible with Bria's models. We offer a range of models, including Text2Image, ControlNets, IP-Adapters, and more; you can check out all our offerings on Hugging Face. Companies can purchase these models, adapt them, and train them to meet their specific needs. From Bria, you get the weights, training code, and evaluation scripts, all customized and ready for developers. And everything is 100% legal!
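For orientation, loading one of these models typically looks like any other diffusers checkpoint. Treat this as a sketch rather than an official quickstart: the repo id is my assumption, and the actual repo may require accepting Bria's license or a slightly different loading path.

```python
import torch
from diffusers import DiffusionPipeline

# "briaai/BRIA-2.3" is an assumed repo id; check Bria's Hugging Face page
# for the exact model name and any license gating before downloading.
pipe = DiffusionPipeline.from_pretrained("briaai/BRIA-2.3", torch_dtype=torch.float16)
pipe.to("cuda")

# Generate a single image from a text prompt and save it to disk.
image = pipe("a product photo on a marble countertop, soft window light").images[0]
image.save("result.png")
```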
Now, you might be asking yourself — why should we pay Bria XXX dollars for a model when we can download it for free with just a click from Stability?
Great question. And my answer? Even better. 😊
The free models available for download today are often trained on data scraped from the internet. The LAION dataset, for example, is well-known for being a broad sweep of internet content. Artists spent years making a living from their photography, illustrations, or advertising work; suddenly, their content was scraped from the internet, and now anyone can use it to train models. Remember that little © symbol? It was supposed to mean something. But overnight, its significance vanished, as did the respect for creators' rights.
Or did it?
Only time will tell whether this disregard for copyright will continue. I remember watching the news one day and seeing an artist named Asaf Hanuka, who had never shared his art with AI model training companies and barely knew what AI was, sitting there stunned as the host showed him AI-generated images that mimicked his style from a simple prompt ("create an illustration in the style of Asaf Hanuka"). His reaction was inspiring (you can see it here): alongside the shock and complex emotions, Asaf recognized that the genie was already out of the bottle and there was no turning back the clock on technology. The challenge now is to move forward and figure out how to harness this technology in a way that also benefits those who created the data in the first place.
Whether or not this copyright concern resonates with you personally, it's a serious issue for businesses. Companies that generate millions of dollars, legal entities that create visual content to push their business forward, can't afford to gamble with copyright uncertainty. Whether it's images for marketing, catalogs, branding, social media graphics, or videos for promotion and training, these visual assets are crucial for campaigns, investor presentations, customer pitches, and sales support. A business creates and manages thousands of images at any given time. It can't take the risk of unclear copyright.
So, here’s the real question — why take shortcuts (or worse, steal) when you can do things correctly and make it a win for everyone? And yes, there’s a better path forward.
What did we do at Bria? It's simple (in hindsight, of course; creating it was anything but, so don't try this at home!).
We "Spotify-ed" the visual Gen AI ecosystem, taking inspiration from Spotify's business model for music. Remember the days of downloading music for free from sketchy sites, where the only ones making money were the folks running the servers and the ads plastered all over the screen? Spotify flipped the script and brought the profits back to the music creators.
We’ve done something similar. We created an alliance of data contributors, and our business model works like this:
First, we train our models on legal data — provided by our partner image suppliers. Then, we sell these models (and yes, the code is accessible to developers for further development — did we mention that already? 😉). And here’s where the real magic of our solution begins — when these models move into production and companies start generating revenue from them.
Bria’s agent saves the embedding vector whenever an image is generated. Then, using a model we call the “Attribution Model,” we can trace back and identify which images in the training set contributed the most to creating the new image. Based on how much they contributed to the final result, we pay royalties to the image providers, artists, and creators who made it all possible.
The more the generated image resembles your original work, the more money you earn. If you want me to define “resemble,” I can — but that’s a topic for another blog post about our Attribution Model.
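To make the idea concrete, here's a toy sketch of attribution by embedding similarity. This is my simplification for illustration, not Bria's actual Attribution Model: it scores each training image by the cosine similarity of its embedding to the generated image's embedding, then splits a royalty pool across the top matches in proportion to their scores.

```python
import numpy as np

def attribute(gen_embedding, train_embeddings, royalty_pool, top_k=10):
    # Normalize so dot products become cosine similarities.
    gen = gen_embedding / np.linalg.norm(gen_embedding)
    train = train_embeddings / np.linalg.norm(train_embeddings, axis=1, keepdims=True)
    sims = train @ gen                       # similarity per training image
    top = np.argsort(sims)[::-1][:top_k]     # most similar contributors
    weights = np.clip(sims[top], 0, None)
    weights = weights / weights.sum()        # proportional contribution shares
    return {int(i): float(w * royalty_pool) for i, w in zip(top, weights)}

# Toy example: 5 training images with 4-dim embeddings, $1.00 to distribute.
rng = np.random.default_rng(0)
train_embeddings = rng.normal(size=(5, 4))
gen_embedding = train_embeddings[2] + 0.1 * rng.normal(size=4)
print(attribute(gen_embedding, train_embeddings, royalty_pool=1.00, top_k=3))
# Image 2 should receive the largest share, since the generation resembles it most.
```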
We believe in releasing our model to benefit research and want to give back to the community. However, we must do this while protecting the interests of our data providers and respecting the copyright of artists. That’s why we distinguish between the academic community — where we release and share the code for research purposes — and businesses to whom we sell the models for fair commercial use. And by fair, we mean genuinely fair (remember, we pay royalties — unlike many other companies).
If you’ve made it this far — congratulations, you deserve a medal. Or maybe the real prize is that you now understand how Bria works and that it’s possible to bring the concept of respecting copyrights and paying royalties into the world of generative AI. Shocking, right?
Wrapping Up: A Different Way Forward with Bria
I hope you got a good peek into how we’re doing things differently at Bria. Balancing innovation with fairness isn’t easy, but someone must do it. Stick around for the following blogs, where I’ll show you how to use Control Net to swap out backgrounds in your images — because who doesn’t love better results?
One honest caveat: even with all this training and refinement, the advanced approach beats the naive method by a wide margin, but the results still aren't perfect. One of our biggest challenges was data leakage, where parts of the foreground object "bleed" into the background and spoil the separation. We also ran into issues with texture consistency and lighting.
Most of these problems were resolved by applying the ideas discussed in this blog within the ControlNet framework, which gave us much finer control over the image generation process. I'll dive deeper into that in the upcoming posts ControlNet Deep Dive—A Different POV and How to Modify ControlNet Input: A Step-by-Step Guide for Input Modification. Stay tuned if you want to take your results to the next level!
Contact us for a deeper understanding of our Generative AI capabilities.