OpenAI, a leading artificial intelligence research organization, has unveiled a groundbreaking audio feature that can convincingly replicate human voices, sparking discussions about the potential risks associated with deepfake technology.
The new feature, called Voice Engine, was showcased through early demos and use cases shared with a select group of about 10 developers, according to an OpenAI spokesperson. Unlike the company's previous audio generation efforts, Voice Engine can emulate an individual's voice, including its specific cadence and intonation, from as little as 15 seconds of recorded audio.
However, OpenAI has decided against a widespread rollout of the feature after receiving feedback from stakeholders including policymakers, industry experts, educators, and creatives. The company acknowledged the serious risks of generating human-like speech, especially in contexts such as elections, where misinformation can have significant consequences.
The potential for misuse of such technology was highlighted by previous incidents, such as a realistic-sounding phone call impersonating President Joe Biden, which raised concerns about AI’s role in manipulating public perception.
Despite these risks, Voice Engine has beneficial applications. The Norman Prince Neurosciences Institute at Lifespan, a health system, is using the technology to help patients recover their voices, and companies like Spotify are exploring translations of audio content using OpenAI's speech model.
To address ethical concerns, OpenAI is imposing strict usage policies on its partners, requiring consent from the original speakers, disclosure to listeners that voices are AI-generated, and embedding of inaudible watermarks for verification. The company is also seeking feedback from external experts before considering a broader release of the feature.
In a broader context, OpenAI’s development underscores the need for societal resilience against advanced AI technologies, urging sectors like banking to rethink voice authentication and advocating for public education on detecting AI-generated content.
Source: Agencies