Real-time deepfake detection

Use the streaming API for real-time results

The Behavioral Signals Streaming API also supports streaming real-time audio for Deepfake Detection.

To integrate it with your application you can either use the low-level gRPC API with your language of choice (Java, Go, NodeJS), or use the Python SDK.

ℹ️ Description of streaming

With streaming deepfake detection you can send chunks of audio to the API and retrieve results in real time. Some example use cases include:

  • Streaming live calls, for example in call centers
  • Agentic AI applications where the user's voice is processed live

In the configuration you can switch between segment-level and utterance-level results (see the sketch after this list).

  • At segment level, results are streamed from the server at fixed intervals (every 2 seconds). This is ideal if you want short time windows. Keep in mind, however, that the shorter the analysis window, the less accurate the results may be.
  • At utterance level, results are streamed from the server at the end of each utterance. This has the advantage that the whole utterance is analyzed before the final decision is made, resulting in more accurate predictions.
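
The level is selected through the AudioConfig message when opening the stream. A minimal sketch, assuming the proto's Level enum defines a segment value alongside the utterance value used in the walkthrough below:

// Level.segment is assumed from the proto's Level enum; the Java example
// later on this page uses Level.utterance.
AudioConfig config = AudioConfig.newBuilder()
    .setEncoding(AudioEncoding.LINEAR_PCM)
    .setSampleRateHertz(16000)
    .setLevel(Level.segment)   // or Level.utterance
    .build();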

📘

Notes

  • The API runs voice activity detection before analysis. This means that if the streamed audio contains silence or non-speech, the API won't stream back any results for those chunks.
  • Best results are obtained with 16 kHz mono audio. The encoding has to be signed 16-bit linear PCM.
  • The chunk duration can be arbitrary, though we recommend values between 100 ms and 500 ms; the chunk-size arithmetic is sketched below.
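
To pick a buffer size for a given chunk duration, multiply it out from the audio format. A minimal sketch of the arithmetic (it matches the 8000-byte, 250 ms buffer used in the Java walkthrough below):

int sampleRate = 16000;   // 16 kHz
int bytesPerSample = 2;   // signed 16-bit linear PCM
int channels = 1;         // mono
double chunkSeconds = 0.25;
// 16000 * 2 * 1 * 0.25 = 8000 bytes per 250 ms chunk
int chunkBytes = (int) (sampleRate * bytesPerSample * channels * chunkSeconds);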

🔨 Using a bare gRPC client

Using the api.proto file you can generate a client for the bidirectional streaming gRPC API in your language of choice. We'll use Java for this example, since the Python use case is already covered by the SDK in the next section. You can find more documentation on gRPC on the official page.

Prerequisites

To follow this example you need Java and Maven installed; see the installation guide for your system: https://maven.apache.org/install.html. You should also create a project first to obtain your cid and API token.

Step 1: Create a new Maven project

You can create a new project for this example by running:

mvn archetype:generate -DgroupId=com.yourcompany.grpc \
  -DartifactId=behavioralsignals-grpc-client \
  -DarchetypeArtifactId=maven-archetype-quickstart \
  -DinteractiveMode=false
cd behavioralsignals-grpc-client

If you already have an active project you can skip this step and proceed to add the dependencies.

Step 2: Add gRPC dependencies

Edit the pom.xml and add the required dependencies and codegen plugin:

<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
         http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <groupId>com.yourcompany.grpc</groupId>
  <artifactId>behavioralsignals-grpc-client</artifactId>
  <version>1.0-SNAPSHOT</version>

  <properties>
    <maven.compiler.source>17</maven.compiler.source>
    <maven.compiler.target>17</maven.compiler.target>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <grpc.version>1.72.0</grpc.version>
    <protobuf.version>4.30.2</protobuf.version>
  </properties>

  <dependencyManagement>
    <dependencies>
      <dependency>
        <groupId>io.grpc</groupId>
        <artifactId>grpc-bom</artifactId>
        <version>${grpc.version}</version>
        <type>pom</type>
        <scope>import</scope>
      </dependency>
    </dependencies>
  </dependencyManagement>

  <dependencies>
    <!-- gRPC & Protobuf -->
    <dependency>
      <groupId>io.grpc</groupId>
      <artifactId>grpc-services</artifactId>
    </dependency>
    <dependency>
      <groupId>io.grpc</groupId>
      <artifactId>grpc-netty-shaded</artifactId>
      <scope>runtime</scope>
    </dependency>
    <dependency>
      <groupId>io.grpc</groupId>
      <artifactId>grpc-protobuf</artifactId>
    </dependency>
    <dependency>
      <groupId>io.grpc</groupId>
      <artifactId>grpc-stub</artifactId>
    </dependency>
    <dependency>
      <groupId>com.google.protobuf</groupId>
      <artifactId>protobuf-java</artifactId>
      <version>${protobuf.version}</version>
    </dependency>

    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
    </dependency>
  </dependencies>

  <build>
    <extensions>
      <extension>
        <groupId>kr.motd.maven</groupId>
        <artifactId>os-maven-plugin</artifactId>
        <version>1.7.0</version>
      </extension>
    </extensions>

    <plugins>
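      <!-- Builds a runnable "fat" JAR containing all dependencies -->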
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-shade-plugin</artifactId>
        <version>3.5.0</version>
        <executions>
          <execution>
            <phase>package</phase>
            <goals>
              <goal>shade</goal>
            </goals>
            <configuration>
              <createDependencyReducedPom>false</createDependencyReducedPom>
              <transformers>
                <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                  <mainClass>com.yourcompany.grpc.App</mainClass> <!-- Update this to match your main class -->
                </transformer>
                <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
              </transformers>
            </configuration>
          </execution>
        </executions>
      </plugin>
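      <!-- Compiles .proto files under src/main/proto into Java classes and gRPC stubs -->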
      <plugin>
        <groupId>io.github.ascopes</groupId>
        <artifactId>protobuf-maven-plugin</artifactId>
        <version>3.4.2</version>
        <configuration>
          <protocVersion>${protobuf.version}</protocVersion>
          <binaryMavenPlugins>
            <binaryMavenPlugin>
              <groupId>io.grpc</groupId>
              <artifactId>protoc-gen-grpc-java</artifactId>
              <version>${grpc.version}</version>
              <options>jakarta_omit,@generated=omit</options>
            </binaryMavenPlugin>
          </binaryMavenPlugins>
        </configuration>
        <executions>
          <execution>
            <id>generate</id>
            <goals>
              <goal>generate</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</project>

Step 3: Copy the api.proto file and generate stubs

Copy the api.proto file to src/main/proto/api.proto. Then edit the file and add these lines right after package behavioral_api.grpc.v1; to configure the package of the generated classes:

option java_multiple_files = true;
option java_package = "com.yourcompany.grpc.api.proto";
option java_outer_classname = "StreamingApiProto";

Then compile the application:

mvn clean compile

Step 4: Use the generated code

You can use the generated classes to interact with the Streaming API. For this example we'll stream bytes from a file to demonstrate the functionality. Let's modify src/main/java/com/yourcompany/grpc/App.java.

The first step is to define the connection parameters (URL, port, and TLS) and the client stub. We also use a variable to track the connection state:

// Define the gRPC channel
ManagedChannel channel = ManagedChannelBuilder.forTarget("streaming.behavioralsignals.com:443")
    .useTransportSecurity()
    .build();

// Create the stub service
BehavioralStreamingApiGrpc.BehavioralStreamingApiStub stub = BehavioralStreamingApiGrpc.newStub(channel);
final AtomicBoolean isConnectionActive = new AtomicBoolean(true);

Next we define the request observer that is used to send messages to the detection endpoint. Since it's a bidirectional stream, we also define the response handler. Here you should customize it with your own logic:

StreamObserver<AudioStream> requestObserver = stub.deepfakeDetection(new StreamObserver<StreamResult>() {
  @Override
  public void onNext(StreamResult result) {
    System.out.println("Received result: " + result);
    // Your custom response handling logic goes here
  }

  @Override
  public void onError(Throwable t) {
    System.err.println("Stream error: " + t.getMessage());
    isConnectionActive.set(false);
  }

  @Override
  public void onCompleted() {
    System.out.println("Server closed stream.");
    isConnectionActive.set(false);
  }
});

We'll read a file from the file system and convert it to the appropriate format:

// Define the target format of the audio
AudioFormat targetFormat = new AudioFormat(
    AudioFormat.Encoding.PCM_SIGNED,
    16000.0f,     // Sample rate
    16,           // Sample size in bits
    1,            // Channels (mono)
    2,            // Frame size (2 bytes = 16-bit mono)
    16000.0f,     // Frame rate
    false         // Little endian (bigEndian = false)
);
AudioInputStream originalStream = AudioSystem.getAudioInputStream(Paths.get(filePath).toFile());
// Resample and downmix to the target format
AudioInputStream convertedStream = AudioSystem.getAudioInputStream(targetFormat, originalStream);

Now we are ready to start streaming. The first message should always contain the configuration of the stream:

// The first message is always the config.
AudioStream config = AudioStream.newBuilder()
    .setConfig(AudioConfig.newBuilder()
        .setEncoding(AudioEncoding.LINEAR_PCM)
        .setSampleRateHertz(16000)
        .setLevel(Level.utterance)
        .build())
    .setCid(cid)
    .setXAuthToken(token)
    .build();
requestObserver.onNext(config);

The rest of the messages carry the audio bytes. Here we send chunks of 250 ms:

// Stream the contents of the file
byte[] buffer = new byte[8000]; // 250 ms of 16 kHz, 16-bit mono audio
int bytesRead;
while ((bytesRead = convertedStream.read(buffer)) != -1) {
  AudioStream content = AudioStream.newBuilder()
      .setCid(cid)
      .setXAuthToken(token)
      .setAudioContent(ByteString.copyFrom(buffer, 0, bytesRead))
      .build();
  requestObserver.onNext(content);
}

Finally, we can close the connection gracefully once we are done:

// Wait until all messages have been processed, then close the stream gracefully
requestObserver.onCompleted();
while (isConnectionActive.get()) Thread.sleep(1000);
channel.shutdownNow();
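
Polling a flag is fine for a demo; if you prefer, a CountDownLatch is a more idiomatic way to block until the server closes the stream. A minimal sketch (requires java.util.concurrent.CountDownLatch and TimeUnit imports):

// Sketch: have onError() and onCompleted() call finished.countDown()
// instead of flipping the AtomicBoolean.
final CountDownLatch finished = new CountDownLatch(1);

requestObserver.onCompleted();
if (!finished.await(1, TimeUnit.MINUTES)) { // wait for the server to close the stream
    System.err.println("Timed out waiting for the server to close the stream");
}
channel.shutdownNow();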

The complete App.java file looks like this:

package com.yourcompany.grpc;

import com.google.protobuf.ByteString;
import com.yourcompany.grpc.api.proto.*;
import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;
import io.grpc.stub.StreamObserver;

import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;
import java.io.IOException;
import java.nio.file.Paths;
import java.util.concurrent.atomic.AtomicBoolean;

/**
 * Streams an audio file to the deepfake detection endpoint and prints the results.
 */
public class App {
    public static void main(String[] args) throws UnsupportedAudioFileException, IOException, InterruptedException {
        // Parse arguments
        if (args.length < 3) {
            System.err.println("Usage: java -jar your-app.jar <cid> <token> <path/to/audio.wav>");
            System.exit(1);
        }

        int cid = Integer.parseInt(args[0]);
        String token = args[1];
        String filePath = args[2];

        // Define the gRPC channel
        ManagedChannel channel = ManagedChannelBuilder.forTarget("streaming.behavioralsignals.com:443")
                .useTransportSecurity()
                .build();

        // Create the stub service
        BehavioralStreamingApiGrpc.BehavioralStreamingApiStub stub = BehavioralStreamingApiGrpc.newStub(channel);
        final AtomicBoolean isConnectionActive = new AtomicBoolean(true);

        // Open deepfake detection stream and define the response handler
        StreamObserver<AudioStream> requestObserver = stub.deepfakeDetection(new StreamObserver<StreamResult>() {
            @Override
            public void onNext(StreamResult result) {
                System.out.println("Received result: " + result);
                // Your custom response handling logic goes here
            }

            @Override
            public void onError(Throwable t) {
                System.err.println("Stream error: " + t.getMessage());
                isConnectionActive.set(false);
            }

            @Override
            public void onCompleted() {
                System.out.println("Server closed stream.");
                isConnectionActive.set(false);
            }
        });


        // Define the target format of the audio
        AudioFormat targetFormat = new AudioFormat(
                AudioFormat.Encoding.PCM_SIGNED,
                16000.0f,     // Sample rate
                16,           // Sample size in bits
                1,            // Channels (mono)
                2,            // Frame size (2 bytes = 16-bit mono)
                16000.0f,     // Frame rate
                false         // Little endian
        );
        AudioInputStream originalStream = AudioSystem.getAudioInputStream(Paths.get(filePath).toFile());
        // Resample and downmix
        AudioInputStream convertedStream = AudioSystem.getAudioInputStream(targetFormat, originalStream);


        // The first message is always the config.
        AudioStream config = AudioStream.newBuilder()
                .setConfig(AudioConfig.newBuilder()
                        .setEncoding(AudioEncoding.LINEAR_PCM)
                        .setSampleRateHertz(16000)
                        .setLevel(Level.utterance)
                        .build())
                .setCid(cid)
                .setXAuthToken(token)
                .build();
        requestObserver.onNext(config);

        // Stream the contents of the file
        byte[] buffer = new byte[8000]; // 250ms of audio
        int bytesRead;
        while ((bytesRead = convertedStream.read(buffer)) != -1) {
            AudioStream content = AudioStream.newBuilder()
                    .setCid(cid)
                    .setXAuthToken(token)
                    .setAudioContent(ByteString.copyFrom(buffer, 0, bytesRead))
                    .build();
            requestObserver.onNext(content);
        }

        // Wait until all messages have been processed, then close the stream gracefully
        requestObserver.onCompleted();
        while (isConnectionActive.get()) Thread.sleep(1000);
        channel.shutdownNow();
    }
}

Step 5: Run the application

Compile and run the application with the commands below. Make sure to replace cid, token, and filePath with your own values:

mvn clean package
java -jar target/behavioralsignals-grpc-client-1.0-SNAPSHOT.jar <your-cid> <your-api-token> /path/to/audio.wav

The output of the app should look like this:

Received result: cid: <your-cid>
pid: 8
result {
  id: "14"
  start_time: "0.00"
  end_time: "9.00"
  task: "deepfake"
  prediction {
    label: "bonafide"
    posterior: "0.995"
    logit: ""
  }
  prediction {
    label: "spoofed"
    posterior: "0.005"
    logit: ""
  }
  final_label: "bonafide"
  embedding: ""
  level: utterance
}

Received result: cid: <your-cid>
pid: 8
message_id: 1
result {
  id: "24"
  start_time: "10.00"
  end_time: "15.00"
  task: "deepfake"
  prediction {
    label: "bonafide"
    posterior: "0.9991"
    logit: ""
  }
  prediction {
    label: "spoofed"
    posterior: "0.0009"
    logit: ""
  }
  final_label: "bonafide"
  embedding: ""
  level: utterance
}

Server closed stream.
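
In a real application you will usually extract specific fields in onNext instead of printing the whole message. A minimal sketch of such a handler body, assuming the nested message types generated from api.proto are named Result and Prediction after the fields shown above (adjust the names to match your generated code):

// Sketch only: `Result` and `Prediction` are assumed type names; check the
// classes generated from your api.proto.
@Override
public void onNext(StreamResult result) {
    Result r = result.getResult();
    System.out.printf("[%s - %s] final label: %s%n",
            r.getStartTime(), r.getEndTime(), r.getFinalLabel());
    for (Prediction p : r.getPredictionList()) {
        System.out.printf("  %s (posterior %s)%n", p.getLabel(), p.getPosterior());
    }
}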


🐍 Using the Python SDK

With our SDK you can start a deepfake detection stream as follows:

from behavioralsignals import Client, StreamingOptions
from behavioralsignals.utils import make_audio_stream

cid = "<your-cid>"
token = "<your-api-token>"
file_path = "/path/to/audio.wav"

client = Client(cid, token)
audio_stream, sample_rate = make_audio_stream(file_path, chunk_size=250)
options = StreamingOptions(sample_rate=sample_rate, encoding="LINEAR_PCM")
for result in client.deepfakes.stream_audio(audio_stream=audio_stream, options=options):
    print(result)

The result is a StreamingResultResponse object defined here.

🕐 Retrieving results from past streams

You can always retrieve results from past streams by using the get results endpoint or the SDK, with the pid that corresponds to the desired stream.