Senior NVIDIA CUDA Engineer / GPU Software Engineer
Job Description
Position: Senior NVIDIA CUDA Engineer / GPU Software Engineer
Location: Waukesha, WI
Job Summary
We are seeking an experienced NVIDIA CUDA Engineer to design, develop, optimize, and maintain GPU-accelerated software for high-performance, real-time, and parallel computing environments. The role involves deep work across CUDA programming, GPU architecture, system-level software, and performance optimization, supporting domains such as embedded systems, medical devices, automotive, AI/ML, HPC, and quantum-classical hybrid computing.
This position requires close collaboration with hardware architects, system engineers, and research teams to deliver scalable, production-grade GPU solutions.
Key Responsibilities
1. GPU Programming & Performance Optimization
• Design, develop, and optimize CUDA-based algorithms for high-performance computing workloads.
• Tune GPU kernels for maximum throughput, low latency, and memory efficiency.
• Optimize use of the GPU memory hierarchy (shared, global, and unified memory) and kernel launch configurations.
• Implement parallel, asynchronous, and distributed computation strategies across CPU/GPU/FPGA systems.
2. Systems & Software Development
• Develop and maintain CUDA runtime libraries, drivers, and toolchain components.
• Work on multi-processor execution models, GPU memory management, and synchronization mechanisms.
• Perform low-level debugging, profiling, and system-level optimization for NVIDIA GPU platforms.
• Support integration of CUDA components into large-scale production systems.
3. Cross-Functional Collaboration
• Collaborate with hardware engineers, GPU architects, and platform teams to co-design efficient GPU solutions.
• Partner with research and AI teams on real-time algorithms, AI workload acceleration, and CUDA-Q (quantum-classical) frameworks.
• Provide technical guidance and mentorship to junior engineers.
4. CI/CD, Testing & Quality Assurance
• Improve and maintain CI/CD pipelines for CUDA-based software components.
• Perform benchmarking, regression testing, and performance validation across software releases.
• Ensure robustness, scalability, and production readiness of GPU software.
• Participate in Unified Functional Testing (UFT) where applicable.
Required Skills & Qualifications
Mandatory Technical Skills
• Strong C/C++ programming expertise.
• 6–10+ years of hands-on experience with CUDA and GPU software development.
• Deep understanding of NVIDIA GPU architecture, memory models, and execution pipelines.
• Expertise in GPU performance tuning and profiling tools (e.g., Nsight Systems, Nsight Compute, and the legacy nvprof).
• Strong knowledge of parallel programming paradigms (multi-threading, SIMD, vectorization).
• Experience with heterogeneous computing environments (CPU/GPU/FPGA).
Preferred / Good-to-Have Skills
• Experience with High-Performance Computing (HPC) systems.
• Exposure to compiler technologies such as LLVM / MLIR.
• Knowledge of distributed systems and multi-node GPU workloads.
• Familiarity with real-time algorithms and AI/ML acceleration.
• Exposure to CUDA-Q or quantum-classical hybrid computing.
• Experience with Unified Functional Testing (UFT) frameworks.
Desirable Attributes
• Strong problem-solving and performance analysis skills.
• Ability to work on low-level, performance-critical software.
• Experience building robust, scalable, production-grade systems.
• Excellent communication skills and ability to work in cross-functional teams.