Ultrasound beamformer for a prototype GPU platform

The Problem

The client’s legacy CPU-based beamformer could not keep up with the data rate of their new transducer array. Frame rates dropped below 15 fps, making the system unsuitable for cardiac imaging. They needed a solution that could handle 4x the data throughput without changing the physical hardware footprint.

Our Approach

We conducted a two-week assessment to profile the existing C++ codebase and identify parallelizable bottlenecks. We then ran a six-week sprint to:

  • Port critical delay-and-sum kernels to CUDA
  • Implement zero-copy memory transfer to minimize latency
  • Build a bit-exact validation suite against the CPU reference
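
To illustrate the validation step, here is a minimal sketch of what a CPU reference and a bit-exact comparison could look like. It is not the client's code: the function names (`delay_and_sum`, `bit_exact`), the channel-major buffer layout, and the whole-sample integer delays are assumptions made for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical CPU reference for delay-and-sum on one scan line.
// Assumed layout: rf[c * num_samples + n] is sample n of channel c.
// Assumed delays: one whole-sample offset per channel (no interpolation).
std::vector<float> delay_and_sum(const std::vector<float>& rf,
                                 const std::vector<int>& delay,
                                 std::size_t num_channels,
                                 std::size_t num_samples) {
    std::vector<float> out(num_samples, 0.0f);
    for (std::size_t n = 0; n < num_samples; ++n) {
        float acc = 0.0f;
        // Fixed channel order: float addition is not associative, so a
        // kernel under test must sum in this same order to match bit for bit.
        for (std::size_t c = 0; c < num_channels; ++c) {
            const std::int64_t idx = static_cast<std::int64_t>(n) + delay[c];
            if (idx >= 0 && idx < static_cast<std::int64_t>(num_samples)) {
                acc += rf[c * num_samples + static_cast<std::size_t>(idx)];
            }
        }
        out[n] = acc;
    }
    return out;
}

// Compares raw bit patterns rather than using ==, so that 0.0f and -0.0f
// count as different and identical NaN payloads count as equal.
bool bit_exact(const std::vector<float>& a, const std::vector<float>& b) {
    if (a.size() != b.size()) return false;
    if (a.empty()) return true;
    return std::memcmp(a.data(), b.data(), a.size() * sizeof(float)) == 0;
}
```

In a validation suite of this shape, the output buffer copied back from the GPU would be passed to `bit_exact` alongside the result of `delay_and_sum` for the same input frame. Matching at the bit level generally also requires the GPU build to avoid fused multiply-add contraction wherever the CPU reference performs a separate multiply and add.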

Results

  • 12x speedup vs CPU
  • 60 fps sustained frame rate

The optimized GPU pipeline met the performance requirements and also freed up CPU resources for post-processing and UI rendering. The solution was delivered with full documentation and a regression test suite.

Facing a similar challenge?

Let's discuss how we can help you achieve similar performance gains.

Request an assessment