August 29 2024
06:23

CEO’s Desk: Ethical Usage of Copyrighted Content in AI Training

The debate over using copyrighted content to train AI splits between unrestricted use in the name of progress and fair compensation for creators. Our CEO, Idan Dobrecki, reviews both sides from an advocacy standpoint and gives his view on this hot topic, discussing what is needed to create fair compensation for sustainable AI development.

Much has been said about the use of copyrighted content for training artificial intelligence models. On this controversial issue there are two opposing views. The first holds that, in the name of progress, artificial intelligence models should be allowed to train on content protected by copyright, since any other option, including licensing deals, would create significant obstacles to improving the technology and would make it very difficult for small companies to produce high-quality datasets sufficient for training significant models. Supporters of this approach further claim that the doctrine of fair use, which in certain circumstances permits the use of a work without the creator’s permission, should apply in these situations (whether such use can also be made for commercial purposes is an open question that we will not expand on in this article). There is logic in this position: just as a painter who draws inspiration and learns the craft from the paintings of predecessors is not required to pay for that “usage” of past creations, so, the argument goes, no similar obligation should apply to artificial intelligence companies, which train their models to create new works based on existing ones.

Despite the above, in this article I will argue that the opposite approach, the one that requires recognizing copyright in content and finding mechanisms to obtain the consent of content owners before their work is used to train artificial intelligence models, is the correct one, not only because it is the more ethical approach but because it will enable the creation of durable and sustainable business models. I will further argue that this is perhaps even more relevant to AI music tools.

Firstly, the idea that copyright owners whose works have been used to train artificial intelligence models should be remunerated rests on values of fairness and ethics. Since artificial intelligence models cannot be created without prior training on large amounts of content protected by copyright, it makes sense from an ethical point of view to find a mechanism that rewards the owners of those works.

Moreover, while it’s true that in a world without artificial intelligence, creators (whether composers, musicians, painters, or writers) learn and draw inspiration from the works of their predecessors without paying royalties, this comparison is an uncomfortable one: it oversimplifies the profound transformation AI is bringing and underestimates the significance of the current revolution.

Artificial intelligence, being a powerful and highly available tool, is going to revolutionize the creator economy in ways we are not yet aware of. The ability to create content in a fraction of the time and with immediate availability is not comparable to the “old” world we are used to. Since we are entering the period in which standards of fair use for training artificial intelligence models will be determined, it is reasonable to claim that the rules for how such models should be trained must be established now. Given the potential impact of artificial intelligence on the fields of art and creativity, it is not unreasonable that copyright owners want to make sure, at this point in time, that they are compensated when models are trained on their works.

Secondly, training artificial intelligence models ethically and fairly, in a way that takes copyright holders into account, may actually be the most effective way to enhance and improve these models. In a world where creators are not rewarded when their work is used to train AI models, their incentive to create additional artistic content is reduced. If artists stop creating content (or reduce the scope of their work, or reduce its availability so that it cannot be used for training artificial intelligence models), then the amount of quality data for training models will decrease, which will directly and negatively affect the quality of the models, certainly those that need very specific data. One solution may be to train artificial intelligence models on artificial data, i.e. data that is itself created by artificial intelligence, but this may lead to a race to the bottom in which, at least in the coming years, the quality of the data on which models are trained decreases and, as a result, the quality of the models decreases as well.

Thirdly, if we want to ensure the widespread adoption of artificial intelligence tools, we must address the psychological factors influencing how people perceive them. A transparent and supportive approach – emphasizing that AI is meant to serve as a tool for creators, not a replacement – will help build trust. By encouraging creators to contribute their content for model training and showing them how they can actively participate in this revolution, including through fair financial compensation, we can foster a more positive and accepting attitude toward AI.

In light of the above, the best strategy moving forward seems to be finding ways to harness the creativity of artistic communities for the benefit of artificial intelligence, while ensuring that copyrighted content is acquired in a fair, ethical, and transparent manner. This conclusion may be even more pertinent for AI music tools, given the unique nature of the music industry.

Unlike other creative sectors, the majority of recorded music is owned by the “Big Three” major record labels: Universal Music Group (UMG), Sony Music Entertainment (SME), and Warner Music Group (WMG). As a result, those looking to train AI music models are not facing numerous, scattered copyright holders who have little incentive to act jointly and would find it complex to do so. The three major labels have already shown both the capability and the willingness to collaborate in preventing the unethical and illegal use of their copyrighted materials, making it more challenging for those seeking to train AI music models without proper licensing.

Moreover, music, unlike images for example, has a temporal dimension. This added complexity makes training AI music models more challenging than training image-based models, which deal with static snapshots in time. For example, creating an AI model that generates an image of a cat in a litter box is relatively straightforward, as it involves combining two static elements, an image of a cat and an image of a litter box (this is a simplified example, but it illustrates the point). A piece of music, by contrast, has to remain coherent as it unfolds over time.

These unique challenges suggest that AI music models might require higher-quality datasets for training and, consequently, greater incentives to keep music creators engaged in producing new music, genres, and styles for the benefit of AI music training.
