Hiya, folks, and welcome to TechCrunch’s regular AI newsletter.
This week in AI, generative AI is beginning to spam up academic publishing — a discouraging new development on the disinformation front.
In a post on Retraction Watch, a blog that tracks recent retractions of academic studies, assistant professors of philosophy Tomasz Żuradzki and Leszek Wroński wrote about three journals published by Addleton Academic Publishers that appear to be made up entirely of AI-generated articles.
The journals contain papers that follow the same template, overstuffed with buzzwords like “blockchain,” “metaverse,” “internet of things” and “deep learning.” They list the same editorial board — 10 members of whom are deceased — and a nondescript address in Queens, New York, that appears to be a house.
So what’s the big deal? you might ask. Isn’t flipping through AI-generated spammy content simply the cost of doing business on the internet these days?
Well, yes. But the fake journals show how easy it is to game the systems used to evaluate researchers for promotions and hiring — and this could be a bellwether for knowledge workers in other industries.
On at least one widely used evaluation system, CiteScore, the journals rank in the top 10 for philosophy research. How is this possible? They extensively cross-cite each other. (CiteScore considers citations in its calculations.) Żuradzki and Wroński find that, of 541 citations in one of Addleton’s journals, 208 come from the publisher’s other fake publications.
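To put those figures in proportion, here is a minimal Python sketch of the arithmetic. The function name `in_network_share` is made up for illustration, and this is not how CiteScore itself is calculated; it only computes what fraction of the 541 reported citations came from the publisher's own journals.

```python
def in_network_share(in_network: int, total: int) -> float:
    """Return the fraction of citations that come from the same publisher."""
    if total <= 0:
        raise ValueError("total citations must be positive")
    if not 0 <= in_network <= total:
        raise ValueError("in-network citations must be between 0 and total")
    return in_network / total


# Figures reported in the Retraction Watch post: 208 of 541 citations.
share = in_network_share(208, 541)
print(f"{share:.1%} of the citations come from the publisher's other journals")
```

By this count, well over a third of the journal's citations originate inside the publisher's own network.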
“[These rankings] frequently serve universities and funding bodies as indicators of the quality of research,” Żuradzki and Wroński wrote. “They play a crucial role in decisions regarding academic awards, hiring and promotion, and thus may influence the publication strategies of researchers.”
One could argue that CiteScore is the problem — clearly it’s a flawed metric. And that’s not a wrong argument to make. But it’s also not wrong to say that generative AI and its abuse are disrupting systems on which people’s livelihoods depend in unexpected — and potentially quite damaging — ways.
There’s a future in which generative AI causes us to rethink and reengineer systems like CiteScore to be more equitable, holistic and inclusive. The grimmer alternative — and the one that’s playing out now — is a future in which generative AI continues to run amok, wreaking havoc and ruining professional lives.
I sure hope we course-correct soon.
News
DeepMind’s soundtrack generator: DeepMind, Google’s AI research lab, says it’s developing AI tech to generate soundtracks for videos. DeepMind’s AI takes the description of a soundtrack (e.g., “jellyfish pulsating under water, marine life, ocean”) paired with a video to create music, sound effects and even dialogue that matches the characters and tone of the video.
A robot chauffeur: Researchers at the University of Tokyo developed and trained a “musculoskeletal humanoid” called Musashi to drive a small electric car through a test track. Equipped with two cameras standing in for human eyes, Musashi can “see” the road in front of it as well as the views reflected in the car’s side mirrors.
A new AI search engine: Genspark, a new AI-powered search platform, taps generative AI to write custom summaries in response to search queries. It’s raised $60 million so far from investors, including Lanchi Ventures; the company’s last funding round valued it at $260 million post-money, a respectable figure as Genspark goes up against rivals like Perplexity.
How much does ChatGPT cost?: How much does ChatGPT, OpenAI’s ever-expanding AI-powered chatbot platform, cost? It’s a tougher question to answer than you might think. To keep track of the various ChatGPT subscription options available, we’ve put together an updated guide to ChatGPT pricing.
Research paper of the week
Autonomous vehicles face an endless variety of edge cases, depending on the location and situation. If you’re on a two-lane road and someone puts their left blinker on, does that mean they’re going to change lanes? Or that you should pass them? The answer may depend on whether you’re on I-5 or the Autobahn.
A group of researchers from Nvidia, USC, UW, and Stanford show in a paper just published at CVPR that a lot of ambiguous or unusual circumstances can be resolved by, if you can believe it, having an AI read the local drivers’ handbook.
Their Large Language Driving Assistant, or LLaDa, gives an LLM access to (rather than fine-tuning it on) the driving manual for a state, country, or region. Local rules, customs, or signage are found in the literature and, when an unexpected circumstance occurs like a honk, high beam, or herd of sheep, an appropriate action (pull over, stop, turn, honk back) is generated.
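For a rough feel of the general idea (look up the relevant handbook passage, then hand it to a language model alongside the unexpected event), here is a toy Python sketch. It is not the researchers' implementation: the handbook excerpts are invented, the word-overlap lookup is a stand-in for whatever retrieval the real system uses, and the names `retrieve` and `build_prompt` are hypothetical. No model is actually called; the sketch stops at building the prompt.

```python
import re

# Invented handbook excerpts, for illustration only.
HANDBOOK = [
    "Flashing high beams from behind usually signals a request to move over and let the vehicle pass.",
    "When livestock such as sheep are on the road, stop and wait until the animals have cleared.",
    "A short honk may be used to warn other road users of your presence.",
]


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(handbook: list[str], event: str, k: int = 1) -> list[str]:
    """Return the k passages sharing the most words with the event description."""
    event_words = tokenize(event)
    ranked = sorted(handbook, key=lambda p: len(tokenize(p) & event_words), reverse=True)
    return ranked[:k]


def build_prompt(event: str, passages: list[str]) -> str:
    """Combine the retrieved passages and the event into a prompt for an LLM."""
    rules = "\n".join(f"- {p}" for p in passages)
    return (
        f"Local driving rules:\n{rules}\n\n"
        f"Unexpected event: {event}\n"
        "What should the driver do?"
    )


event = "The car behind is flashing its high beams"
passages = retrieve(HANDBOOK, event)
prompt = build_prompt(event, passages)
print(prompt)
```

The point of the design is that the local knowledge lives in the text handed to the model at query time, so swapping regions would mean swapping the handbook, not retraining anything.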
It’s by no means a full end-to-end driving system, but it shows an alternate path to a “universal” driving system that still encounters surprises. Plus perhaps a way for the rest of us to know why we’re being honked at when visiting parts unknown.
Model of the week
On Monday, Runway, a company building generative AI tools geared toward film and image content creators, unveiled Gen-3 Alpha. Trained on a vast number of images and videos from both public and in-house sources, Gen-3 can generate video clips from text descriptions and still images.
Runway says that Gen-3 Alpha delivers a “major” improvement in generation speed and fidelity over Runway’s previous flagship video model, Gen-2, as well as fine-grained controls over the structure, style and motion of the videos that it creates. Gen-3 can also be tailored to allow for more “stylistically controlled” and consistent characters, Runway says, targeting “specific artistic and narrative requirements.”
Gen-3 Alpha has its limitations — including the fact that its footage maxes out at 10 seconds. However, Runway co-founder Anastasis Germanidis promises that it’s just the first of several video-generating models to come in a next-gen model family trained on Runway’s upgraded infrastructure.
Gen-3 Alpha is only the latest generative video system of several to emerge on the scene in recent months. Others include OpenAI’s Sora, Luma’s Dream Machine and Google’s Veo. Together, they threaten to upend the film and TV industry as we know it — assuming they can beat copyright challenges.
Grab bag
AI won’t be taking your next McDonald’s order.
McDonald’s this week announced that it would remove automated order-taking tech, which the fast-food chain had been testing for the better part of three years, from more than 100 of its restaurant locations. The tech — co-developed with IBM and installed in restaurant drive-thrus — went viral last year for its propensity to misunderstand customers and make mistakes.
A recent piece in the Takeout suggests that AI is losing its grip on fast-food operators broadly, who not long ago expressed enthusiasm for the tech and its potential to boost efficiency (and reduce labor costs). Presto, a major player in the space for AI-assisted drive-thru lanes, recently lost a major customer, Del Taco, and faces mounting losses.
The issue is inaccuracy.
McDonald’s CEO Chris Kempczinski told CNBC in June 2021 that its voice-recognition technology was accurate about 85% of the time, but that human staff had to assist with about one in five orders. The best version of Presto’s system, meanwhile, only completes roughly 30% of orders without the help of a human being, according to the Takeout.
So while AI is decimating certain segments of the gig economy, it seems that some jobs — particularly those that require understanding a diverse range of accents and dialects — can’t be automated away. For now, at least.