1. Tech giants are aggressively seeking new data sources to power AI systems.
2. Meta executives held daily meetings in 2021 to develop a plan related to data sourcing for AI.
3. Meta considered buying a publishing house, discussed copyright concerns, and ultimately decided to operate under fair use guidelines for training AI systems.
Tech giants such as Meta are in a rush to find new sources of data to power their AI development. Meta held daily meetings last year to address the issue, considering options such as buying Simon & Schuster or paying for licensing rights to books. By then, Meta had already gathered summaries from copyrighted sources, despite concerns over possible copyright infringement.
During the meetings, Meta considered collecting data from potentially copyrighted sources without obtaining licensing deals. When legal concerns were raised, executives decided to rely on the precedent set by Authors Guild v. Google, in which Google was allowed to scan and digitize books under fair use. Meta’s lawyers felt they could train their AI systems on the same basis.
As AI systems become more advanced, tech companies are under pressure to collect more data, raising concerns about copyright compliance. OpenAI, for example, was suspected of using YouTube videos to train its video generator, but the company’s CTO denied these allegations. For Meta, data acquisition and the use of copyrighted material have become crucial to staying ahead in the AI arms race.