Senior Systems Software Manager - TAO Build, Automation and Release
Company: NVIDIA Corporation
Location: Santa Clara
Posted on: November 8, 2024
Job Description:
NVIDIA is hiring a Senior Systems Software Manager for Build,
Automation, Release, and Optimization to join the TAO Toolkit
Deep Learning Architectures team. The toolkit encompasses scalable
and easy-to-use modules for training, fine-tuning, and optimization
for Computer Vision and Multi-Modal AI, to help advance the state
of the art while improving performance. If you have a passion for
pioneering technologies and a commitment to developing scalable,
optimized, and ethical AI, we invite you to join our strong team at
NVIDIA.
In this role, you will lead and supervise the development,
implementation, and optimization of continuous integration and
continuous deployment (CI/CD) pipelines and release management
processes. This role is critical in ensuring the efficient and
reliable delivery of software solutions. The ideal candidate will
bring a deep understanding of modern DevOps practices, including
automation, orchestration, and infrastructure, along with leadership
experience to drive a high-performing engineering team.
What you'll be doing:
- Lead a team of developers to improve CI/CD tools
integration/operations, and full automation of CI/testing.
- Lead efforts to resolve production issues and implement
necessary integrations.
- Lead the ongoing design, implementation, and maintenance of
systems and tools across the toolkit stack.
- Design, implement, and manage cloud infrastructure for
continuous integration, delivery, and deployment.
- Partner with a cross-functional team, including engineering,
product, and QA, to improve development workflows, reduce
bottlenecks, manage and minimize risks, and enhance software
delivery speed and quality.
- Lead the development of robust processes to write and maintain
documentation infrastructure.
- Communicate effectively with technical and non-technical
partners to set shared expectations and ensure visibility around
the release and deployment process.
- Collaborate with diverse software, research, and hardware teams
across geographies to analyze the interplay of hardware and
software architectures to solve critical problems and support future
applications.
What we need to see:
- Bachelor's/Master's degree or equivalent experience in Computer
Science, Information Systems, Engineering, or other related
fields.
- 8+ years of overall proven experience in software engineering,
DevOps, or release management, with at least 3 years in a leadership
or managerial role.
- Proven experience with automation and orchestration tools
including Jenkins, Bazel, GitLab, Docker, and Kubernetes.
- Strong expertise in cloud platforms like AWS, Azure, GCP, or
others.
- Proven experience in developing production-quality software
pipelines for AI, computer vision, or multi-modal algorithms,
especially with LLMs and Multi-Modal Foundation models.
- Expertise in release management, version control systems, and
configuration management.
- Strong programming skills in Python and/or C++, and experience
developing integrated AI solutions.
- Proven track record of leading projects, managing timelines, and
delivering results in an Agile/Scrum environment.
- Strong analytical and problem-solving skills with a focus on
practical and scalable AI solutions.
- Strong interpersonal skills and ability to work in a
collaborative environment.
Ways to stand out from the crowd:
- Knowledge of tools like Ansible, Terraform, and Puppet for
automating repetitive tasks and infrastructure provisioning.
- Proven experience in automating the building and deploying of
software around AI infrastructure.
- Experience with security practices and trustworthy AI.
- Background with NVIDIA SDKs such as TensorRT, RAPIDS, CUDA, and
cuDNN.
NVIDIA is widely considered to be one of the technology
industry's most desirable employers. We have some of the most
forward-thinking and hard-working people working with us and our
engineering teams. If you're a creative engineer with a real
passion for building scalable and robust infrastructure, we want to
hear from you.
The base salary range is 272,000 USD - 419,750 USD.
Your base salary will be determined based on your location,
experience, and the pay of employees in similar positions. You will
also be eligible for equity and benefits.
NVIDIA accepts applications on an ongoing basis.
NVIDIA is committed to fostering a
diverse work environment and proud to be an equal opportunity
employer. As we highly value diversity in our current and future
employees, we do not discriminate (including in our hiring and
promotion practices) on the basis of race, religion, color,
national origin, gender, gender expression, sexual orientation,
age, marital status, veteran status, disability status, or any
other characteristic protected by law.