Date: 2023-10-02

Meta, formerly known as Facebook, has revealed that it used public posts from Facebook and Instagram to train its new Meta AI virtual assistant. The company emphasized that private posts shared only among family and friends were excluded to protect consumers’ privacy. Additionally, private chats on Meta’s messaging services were not used as training data for the AI model.

To further safeguard user privacy, Meta also filtered private details out of the public datasets used in training. Nick Clegg, Meta’s President of Global Affairs, said the company tried to exclude datasets containing a large amount of personal information. Clegg added that websites such as LinkedIn were deliberately not used as sources because of privacy concerns.

This move comes at a time when tech companies are facing criticism for using internet-scraped information without permission to train their AI models. Concerns have been raised about the potential use of private or copyrighted materials, which could lead to copyright infringement lawsuits. Meta’s new AI assistant, Meta AI, was unveiled at the company’s annual Connect conference, which focused on artificial intelligence.

Meta AI can generate text, audio, and imagery, and it has access to real-time information through a partnership with Microsoft’s Bing search engine. Its training data included public Facebook and Instagram posts, both text and photos; these were used to train the image generation features. The chat functions were supplemented with publicly available and annotated datasets.

Meta acknowledges the potential for litigation over its use of copyrighted materials. Clegg believes that using creative content in this way is covered by fair use, the doctrine that permits limited use of protected works for purposes such as commentary, research, and parody. However, he expects the question to be settled in court.

The relationship between AI and copyrighted material is complex. OpenAI recently announced that it is allowing artists to opt out of its training data, but the process is not straightforward. The issue is likely to remain significant in the AI field as companies try to balance innovation with copyright protection.

Source: Reuters