Date: 2023-09-29

Meta Platforms, the parent company of Facebook and Instagram, has revealed details about the data used to train parts of its new Meta AI virtual assistant. According to Meta’s President of Global Affairs, Nick Clegg, the company used public posts from Facebook and Instagram for training, but excluded private posts shared only with family and friends in order to prioritize consumer privacy. Clegg also stated that private chats on Meta’s messaging services were not used as training data, and that steps were taken to filter private details from public datasets.

Meta made a conscious decision not to use certain websites, such as LinkedIn, due to privacy concerns. The company aims to exclude datasets that contain a heavy preponderance of personal information. The majority of data used for training Meta AI was publicly available.

The collection and use of public data by tech companies for training AI models has faced criticism over privacy and copyright concerns. Meta’s AI tools, which include the Meta AI assistant, were unveiled at the company’s annual Connect conference. The assistant is built on the Llama 2 language model and a new image-generation model called Emu, and it can generate text, audio, and imagery. Real-time information is accessed through a partnership with Microsoft’s Bing search engine.

Meta AI was trained using public posts from Facebook and Instagram, encompassing both text and photos. The image generation elements of the assistant were trained with these posts, while the chat functions relied on the Llama 2 model supplemented with publicly available and annotated datasets. Meta AI may also utilize interactions with users to enhance its features in the future.

Meta has imposed safety restrictions on the content generated by Meta AI, such as a ban on creating photo-realistic images of public figures. The use of copyrighted materials is expected to result in litigation as companies navigate the boundaries of the fair use doctrine. Clegg anticipates a “fair amount of litigation” to determine whether the use of creative content falls within fair use. A Meta spokesperson highlighted the company’s new terms of service, which prohibit users from generating content that violates privacy and intellectual property rights.

