- Mastery of open-source and commercial HPC/AI applications.
- Deep experience installing, benchmarking, and fine-tuning open-source applications, libraries, and compilers across CPU and GPU platforms.
- Proficient in deploying, optimizing, and benchmarking scientific codes (WRF, OpenFOAM, LAMMPS, GROMACS, Quantum ESPRESSO, VASP, NAMD, BLAST, GATK, Ansys, Abaqus, MATLAB, LS‑DYNA, Nastran, CAE/CFX, etc.).
- Compiler & Library Optimization: advanced user of Intel oneAPI, AOCC, NVIDIA HPC SDK, GNU, LLVM, and PGI compilers, and of MPI libraries (Open MPI, MPICH, Intel MPI). In-depth profiling with Nsight, VTune, and PAPI.
- Expert in AI frameworks and libraries: TensorFlow (CPU/GPU), PyTorch, Keras, Theano, Caffe, and cuDNN. Strong knowledge of NVIDIA NGC, NIM, and NeMo.
- Proficient with workload and resource managers (PBS, LSF, Slurm, Kubernetes).
- Knowledge of application build and installation tools: source builds, CMake, Spack, EasyBuild, Mamba, etc.
- Benchmarking experience in accelerated HPC: HPL, HPCG, STREAM, MLPerf, and scientific applications.
- Skilled in NVIDIA GPU tuning, CUDA and NIM workflows, kernel optimization, memory throughput tuning, and multi-GPU scaling strategies.
- Knowledge of GenAI frameworks and platforms such as Hugging Face and OpenAI.
- Knowledge of data preprocessing and model evaluation tools.
- Fluent in Bash, Python, and other scripting languages to automate installation, deployment, performance testing, and administrative tasks.
- Strong interpersonal skills; well versed in customer interaction, technical documentation, and collaboration with cross-functional teams.