Real-time deepfake detection
Use the streaming API for real-time results
The Behavioral Signals Streaming API also supports streaming real-time audio for Deepfake Detection.
To integrate it with your application, you can either use the low-level gRPC
API in your language of choice (Java, Go, Node.js), or use the Python SDK.
ℹ️ Description of streaming
With streaming deepfake detection you can send chunks of audio to the API and receive results in real time. Example use cases include:
- Streaming live calls, for example in call centers
- Agentic AI applications where the user's voice is processed live
In the configuration you can switch between segment-level and utterance-level results.
- At the segment level, results are streamed from the server at fixed intervals of 2 seconds. This is ideal if you want short time windows. Keep in mind, however, that the shorter the analysis window, the less accurate the results may be.
- At the utterance level, results are streamed from the server at the end of each utterance. The advantage is that the whole utterance is analyzed before the final decision is made, resulting in more accurate predictions.
Notes
- The API detects voice activity before analyzing the audio. If the streamed audio contains silence or non-speech, the API won't stream back any results for those chunks.
- Best results are obtained with 16kHz mono audio. The encoding must be linear PCM, signed 16-bit.
- Chunks can be of arbitrary size, though we recommend durations between 100ms and 500ms (see the sketch below for converting a duration into a buffer size).
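To translate a chunk duration into a buffer size, multiply sample rate × bytes per sample × channels × duration. A minimal sketch for the recommended 16kHz, signed 16-bit, mono format (the helper name is ours, for illustration):
// Bytes needed for one chunk of signed 16-bit mono PCM:
// e.g. 16000 samples/s * 2 bytes * 1 channel * 0.250 s = 8000 bytes
static int chunkSizeBytes(int sampleRateHz, int bytesPerSample, int channels, int chunkMs) {
    return sampleRateHz * bytesPerSample * channels * chunkMs / 1000;
}
// chunkSizeBytes(16000, 2, 1, 250) == 8000 -- the buffer size used in the Java example below.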
🔨 Using a bare gRPC client
Using the api.proto file you can generate a bidirectional gRPC client in your language of choice. We'll use Java for this example, since the Python use case is already covered by the SDK in the next section. You can find more documentation on gRPC on the official page.
Prerequisites
To follow this example you need Java and Maven installed. You can follow this guide, depending on your system: https://maven.apache.org/install.html. You should also create a project first to obtain your cid and API token.
Step 1: Create a new Maven project
You can create a new project for this example by running:
mvn archetype:generate -DgroupId=com.yourcompany.grpc \
-DartifactId=behavioralsignals-grpc-client \
-DarchetypeArtifactId=maven-archetype-quickstart \
-DinteractiveMode=false
cd behavioralsignals-grpc-client
If you already have an active project you can skip this step and proceed to add the dependencies.
Step 2: Add gRPC dependencies
Edit the pom.xml and add the required dependencies and the codegen plugin:
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.yourcompany.grpc</groupId>
<artifactId>behavioralsignals-grpc-client</artifactId>
<version>1.0-SNAPSHOT</version>
<properties>
<maven.compiler.source>17</maven.compiler.source>
<maven.compiler.target>17</maven.compiler.target>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<grpc.version>1.72.0</grpc.version>
<protobuf.version>4.30.2</protobuf.version>
</properties>
<dependencyManagement>
<dependencies>
<dependency>
<groupId>io.grpc</groupId>
<artifactId>grpc-bom</artifactId>
<version>${grpc.version}</version>
<type>pom</type>
<scope>import</scope>
</dependency>
</dependencies>
</dependencyManagement>
<dependencies>
<!-- gRPC & Protobuf -->
<dependency>
<groupId>io.grpc</groupId>
<artifactId>grpc-services</artifactId>
</dependency>
<dependency>
<groupId>io.grpc</groupId>
<artifactId>grpc-netty-shaded</artifactId>
<scope>runtime</scope>
</dependency>
<dependency>
<groupId>io.grpc</groupId>
<artifactId>grpc-protobuf</artifactId>
</dependency>
<dependency>
<groupId>io.grpc</groupId>
<artifactId>grpc-stub</artifactId>
</dependency>
<dependency>
<groupId>com.google.protobuf</groupId>
<artifactId>protobuf-java</artifactId>
<version>${protobuf.version}</version>
</dependency>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>3.8.1</version>
<scope>test</scope>
</dependency>
</dependencies>
<build>
<extensions>
<extension>
<groupId>kr.motd.maven</groupId>
<artifactId>os-maven-plugin</artifactId>
<version>1.7.0</version>
</extension>
</extensions>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>3.5.0</version>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
<configuration>
<createDependencyReducedPom>false</createDependencyReducedPom>
<transformers>
<transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
<mainClass>com.yourcompany.grpc.App</mainClass> <!-- Update this to match your main class -->
</transformer>
<transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
</transformers>
</configuration>
</execution>
</executions>
</plugin>
<plugin>
<groupId>io.github.ascopes</groupId>
<artifactId>protobuf-maven-plugin</artifactId>
<version>3.4.2</version>
<configuration>
<protocVersion>${protobuf.version}</protocVersion>
<binaryMavenPlugins>
<binaryMavenPlugin>
<groupId>io.grpc</groupId>
<artifactId>protoc-gen-grpc-java</artifactId>
<version>${grpc.version}</version>
<options>jakarta_omit,@generated=omit</options>
</binaryMavenPlugin>
</binaryMavenPlugins>
</configuration>
<executions>
<execution>
<id>generate</id>
<goals>
<goal>generate</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build>
</project>
Step 3: Copy the api.proto file and generate stubs
Copy the api.proto file to src/main/proto/api.proto. Then edit the file and add these lines right after package behavioral_api.grpc.v1; to configure the path of the generated classes:
option java_multiple_files = true;
option java_package = "com.yourcompany.grpc.api.proto";
option java_outer_classname = "StreamingApiProto";
Then compile the application:
mvn clean compile
Step 4: Use the generated code
You can use the generated classes to interact with the Streaming API. For this example we'll stream bytes from a file to demonstrate the functionality. Let's modify src/main/java/com/yourcompany/grpc/App.java.
The first step is to define the connection parameters (URL, port, and TLS) and the client stub. We also use a variable to track the connection state:
// Define the grpc channel
ManagedChannel channel = ManagedChannelBuilder.forTarget("streaming.behavioralsignals.com:443")
.useTransportSecurity()
.build();
// Create the stub service
BehavioralStreamingApiGrpc.BehavioralStreamingApiStub stub = BehavioralStreamingApiGrpc.newStub(channel);
final AtomicBoolean isConnectionActive = new AtomicBoolean(true);
Next we define the request observer that is used to send messages to the detection endpoint. Since this is a bidirectional stream, we also define the response handler; this is where you plug in your own logic:
StreamObserver<AudioStream> requestObserver = stub.deepfakeDetection(new StreamObserver<StreamResult>() {
@Override
public void onNext(StreamResult result) {
System.out.println("Received result: " + result);
// Your custom response handling logic goes here
}
@Override
public void onError(Throwable t) {
System.err.println("Stream error: " + t.getMessage());
isConnectionActive.set(false);
}
@Override
public void onCompleted() {
System.out.println("Server closed stream.");
isConnectionActive.set(false);
}
});
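As a sketch of what custom handling inside onNext might look like, you could inspect the prediction fields of each result. The getter names below (getResult(), getFinalLabel(), getPredictionList()) are assumptions derived from the proto field names visible in the sample output at the end of this guide; verify them against your generated classes:
@Override
public void onNext(StreamResult result) {
    // Assumed getters, inferred from the result / final_label / prediction
    // fields shown in the sample output -- check your generated code.
    String finalLabel = result.getResult().getFinalLabel();
    result.getResult().getPredictionList().forEach(p ->
            System.out.printf("%s -> %s%n", p.getLabel(), p.getPosterior()));
    if ("spoofed".equals(finalLabel)) {
        // e.g. flag the session for review
    }
}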
We'll read a file from the file system and convert it to the appropriate format:
// Define the target format of the audio
AudioFormat targetFormat = new AudioFormat(
AudioFormat.Encoding.PCM_SIGNED,
16000.0f, // Sample rate
16, // Sample size in bits
1, // Channels (mono)
2, // Frame size (2 bytes = 16-bit mono)
16000.0f, // Frame rate
false // bigEndian flag; false means little-endian
);
AudioInputStream originalStream = AudioSystem.getAudioInputStream(Paths.get(filePath).toFile());
// Resample and downmix
AudioInputStream convertedStream = AudioSystem.getAudioInputStream(targetFormat, originalStream);
Now we are ready to start streaming. The first message should always contain the configuration of the stream:
// The first message is always the config.
AudioStream config = AudioStream.newBuilder()
.setConfig(AudioConfig.newBuilder()
.setEncoding(AudioEncoding.LINEAR_PCM)
.setSampleRateHertz(16000)
.setLevel(Level.utterance)
.build())
.setCid(cid)
.setXAuthToken(token)
.build();
requestObserver.onNext(config);
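To receive fixed-interval results instead, only the level in the config changes. A sketch, assuming the Level enum in api.proto exposes a segment value alongside the utterance value used above:
// Segment-level config: results are streamed every ~2 seconds.
// Level.segment is assumed by analogy with Level.utterance -- check the enum in api.proto.
AudioConfig segmentConfig = AudioConfig.newBuilder()
        .setEncoding(AudioEncoding.LINEAR_PCM)
        .setSampleRateHertz(16000)
        .setLevel(Level.segment)
        .build();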
The rest of the messages include the audio bytes. Here we send chunks of 250ms:
// Stream the contents of the file
byte[] buffer = new byte[8000]; // 250ms of audio
int bytesRead;
while ((bytesRead = convertedStream.read(buffer)) != -1) {
AudioStream content = AudioStream.newBuilder()
.setCid(cid)
.setXAuthToken(token)
.setAudioContent(ByteString.copyFrom(buffer, 0, bytesRead))
.build();
requestObserver.onNext(content);
}
Finally we can gracefully close the connection once we are done:
// Wait till all messages have been processed, then close the stream gracefully
requestObserver.onCompleted();
while (isConnectionActive.get()) Thread.sleep(1000);
channel.shutdownNow();
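A note on the design choice: shutdownNow() cancels any remaining calls immediately, which is fine here because the server has already closed the stream. If you prefer a bounded, graceful teardown, gRPC's ManagedChannel also provides shutdown() and awaitTermination() (this variant needs java.util.concurrent.TimeUnit imported):
// Graceful alternative: stop new calls, wait briefly for in-flight
// work to drain, then force shutdown only if the deadline passes.
channel.shutdown();
if (!channel.awaitTermination(5, TimeUnit.SECONDS)) {
    channel.shutdownNow();
}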
The complete App.java file looks like this:
package com.yourcompany.grpc;
import com.google.protobuf.ByteString;
import com.yourcompany.grpc.api.proto.*;
import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;
import io.grpc.stub.StreamObserver;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;
import java.io.IOException;
import java.nio.file.Paths;
import java.util.concurrent.atomic.AtomicBoolean;
/**
 * Streams an audio file to the deepfake detection endpoint and prints the results.
 */
public class App {
public static void main(String[] args) throws UnsupportedAudioFileException, IOException, InterruptedException {
// Parse arguments
if (args.length < 3) {
System.err.println("Usage: java -jar your-app.jar <cid> <token> <path/to/audio.wav>");
System.exit(1);
}
int cid = Integer.parseInt(args[0]);
String token = args[1];
String filePath = args[2];
// Define the grpc channel
ManagedChannel channel = ManagedChannelBuilder.forTarget("streaming.behavioralsignals.com:443")
.useTransportSecurity()
.build();
// Create the stub service
BehavioralStreamingApiGrpc.BehavioralStreamingApiStub stub = BehavioralStreamingApiGrpc.newStub(channel);
final AtomicBoolean isConnectionActive = new AtomicBoolean(true);
// Open deepfake detection stream and define the response handler
StreamObserver<AudioStream> requestObserver = stub.deepfakeDetection(new StreamObserver<StreamResult>() {
@Override
public void onNext(StreamResult result) {
System.out.println("Received result: " + result);
// Your custom response handling logic goes here
}
@Override
public void onError(Throwable t) {
System.err.println("Stream error: " + t.getMessage());
isConnectionActive.set(false);
}
@Override
public void onCompleted() {
System.out.println("Server closed stream.");
isConnectionActive.set(false);
}
});
// Define the target format of the audio
AudioFormat targetFormat = new AudioFormat(
AudioFormat.Encoding.PCM_SIGNED,
16000.0f, // Sample rate
16, // Sample size in bits
1, // Channels (mono)
2, // Frame size (2 bytes = 16-bit mono)
16000.0f, // Frame rate
false // bigEndian flag; false means little-endian
);
AudioInputStream originalStream = AudioSystem.getAudioInputStream(Paths.get(filePath).toFile());
// Resample and downmix
AudioInputStream convertedStream = AudioSystem.getAudioInputStream(targetFormat, originalStream);
// The first message is always the config.
AudioStream config = AudioStream.newBuilder()
.setConfig(AudioConfig.newBuilder()
.setEncoding(AudioEncoding.LINEAR_PCM)
.setSampleRateHertz(16000)
.setLevel(Level.utterance)
.build())
.setCid(cid)
.setXAuthToken(token)
.build();
requestObserver.onNext(config);
// Stream the contents of the file
byte[] buffer = new byte[8000]; // 250ms of audio
int bytesRead;
while ((bytesRead = convertedStream.read(buffer)) != -1) {
AudioStream content = AudioStream.newBuilder()
.setCid(cid)
.setXAuthToken(token)
.setAudioContent(ByteString.copyFrom(buffer, 0, bytesRead))
.build();
requestObserver.onNext(content);
}
// Wait till all messages have been processed, then close the stream gracefully
requestObserver.onCompleted();
while (isConnectionActive.get()) Thread.sleep(1000);
channel.shutdownNow();
}
}
Step 5: Run the application
Compile and run the application with the commands below, passing your own cid, token, and audio file path as arguments:
mvn clean package
java -jar target/behavioralsignals-grpc-client-1.0-SNAPSHOT.jar <your-cid> <your-api-token> /path/to/audio.wav
The output of the app should look like this:
Received result: cid: <your-cid>
pid: 8
result {
id: "14"
start_time: "0.00"
end_time: "9.00"
task: "deepfake"
prediction {
label: "bonafide"
posterior: "0.995"
logit: ""
}
prediction {
label: "spoofed"
posterior: "0.005"
logit: ""
}
final_label: "bonafide"
embedding: ""
level: utterance
}
Received result: cid: <your-cid>
pid: 8
message_id: 1
result {
id: "24"
start_time: "10.00"
end_time: "15.00"
task: "deepfake"
prediction {
label: "bonafide"
posterior: "0.9991"
logit: ""
}
prediction {
label: "spoofed"
posterior: "0.0009"
logit: ""
}
final_label: "bonafide"
embedding: ""
level: utterance
}
Server closed stream.
🐍 Using the Python SDK
With our SDK you can start a deepfake detection stream like this:
from behavioralsignals import Client, StreamingOptions
from behavioralsignals.utils import make_audio_stream
cid = "<your-cid>"
token = "<your-api-token>"
file_path = "/path/to/audio.wav"
client = Client(cid, token)
audio_stream, sample_rate = make_audio_stream(file_path, chunk_size=250)
options = StreamingOptions(sample_rate=sample_rate, encoding="LINEAR_PCM")
for result in client.deepfakes.stream_audio(audio_stream=audio_stream, options=options):
print(result)
The result is a StreamingResultResponse object, defined here.
🕐 Retrieving results from past streams
You can always retrieve results from past streams by using the get results endpoint or the SDK, with the pid that corresponds to the desired stream.