NIST launches a new platform to assess generative AI

The National Institute of Standards and Technology (NIST), the U.S. Commerce Department agency that develops and tests tech for the U.S. government, corporations and the broader public, on Monday announced the launch of NIST GenAI, a new program spearheaded by NIST to assess generative AI technologies, including text- and image-generating AI.

A platform designed to evaluate various forms of generative AI tech, NIST GenAI will release benchmarks, help create “content authenticity” detection (i.e. deepfake-checking) systems and encourage the development of software to spot the source of fake or misleading information, explains NIST on its newly-launched NIST GenAI site and in a press release.

“The NIST GenAI program will issue a series of challenge problems designed to evaluate and measure the capabilities and limitations of generative AI technologies,” the press release reads. “These evaluations will be used to identify strategies to promote information integrity and guide the safe and responsible use of digital content.”

NIST GenAI’s first project is a pilot study to build systems that can reliably tell the difference between human-created and AI-generated media, starting with text. (While many services purport to detect deepfakes, studies — and our own testing — have shown them to be unreliable, particularly when it comes to text.) NIST GenAI is inviting teams from academia, industry and research labs to submit either “generators” — AI systems to generate content — or “discriminators” — systems that try to identify AI-generated content.

Generators in the study must generate summaries provided a topic and a set of documents, while discriminators must detect if a given summary is AI-written or not. To ensure fairness, NIST GenAI will provide the data necessary to train generators and discriminators; systems trained on publicly available data won’t be accepted, including but not limited to open models like Meta’s Llama 3.

Registration for the pilot will begin May 1, with the results scheduled to be published in February 2025.

NIST GenAI’s launch — and deepfake-focused study — comes as deepfakes grow exponentially.

According to data from Clarity, a deepfake detection firm, 900% more deepfakes have been created this year compared to the same time frame last year. It’s causing alarm, understandably. A recent poll from YouGov found that 85% of Americans said they were concerned about the spread of misleading deepfakes online.

The launch of NIST GenAI is a part of NIST’s response to President Joe Biden’s executive order on AI, which laid out rules requiring greater transparency from AI companies about how their models work and established a raft of new standards, including for labeling content generated by AI.

It’s also the first AI-related announcement from NIST after the appointment of Paul Christiano, a former OpenAI researcher, to the agency’s AI Safety Institute.

Christiano was a controversial choice for his “doomerist” views; he once predicted that “there’s a 50% chance AI development could end in [humanity’s destruction]” Critics — including scientists within NIST, reportedly — fear Cristiano may encourage the AI Safety Institute to focus to “fantasy scenarios” rather than realistic, more immediate risks from AI.

NIST says that NIST GenAI will inform the AI Safety Institute’s work.