Staff Engineer, Platform - San Francisco, CA
Company: Amplitude
Location: San Francisco
Posted on: November 11, 2024
Job Description:
Amplitude is a leading digital analytics platform that helps
companies unlock the power of their products. More than 3,200
customers, including Atlassian, Jersey Mike's, NBCUniversal,
Shopify, and Under Armour, rely on Amplitude to gain self-service
visibility into the entire customer journey. Amplitude guides
companies every step of the way as they capture data they can
trust, uncover clear insights about customer behavior, and take
faster action. When teams understand how people are using their
products, they can deliver better product experiences that drive
growth.
As an organization, we approach challenges with humility,
take ownership of our contributions, and embrace a growth mindset
that pushes us to constantly improve ourselves, each other, and the
value we bring to customers and partners.
Amplitude's Commitment to Diversity, Equity & Inclusion (DEI):
Amplitude believes that
diversity enables the creation of better products, improves the
ability to solve complex problems, and drives more powerful
solutions. We strive to create an environment of inclusion, one
focused on psychological safety, empathy, and human connection, that
will allow employees of all backgrounds to thrive.
About the Role:
We
are looking for a highly experienced and collaborative Staff
Platform Engineer to join our team. You will be responsible for the
design, automation, and optimization of our Kubernetes-based
platform, ensuring that it is scalable, easy to use, and reliable.
This is a critical role for our company, and we are seeking someone
who not only has deep technical expertise in cloud infrastructure
and Kubernetes but also values mentoring, collaboration, and open
communication. Your work will directly impact our developer
productivity by building systems and abstractions that simplify the
deployment of new workloads, making it easy for developers to focus
on building features, not infrastructure.
In this role, you will
help drive a cultural shift in how our Platform team operates,
working to create positive relationships across the company,
building trust, and making our platform easier to use. We value
someone who listens, learns, and communicates effectively while
still ensuring high technical standards and reliability.
Key Responsibilities:
- Lead the design, implementation, and management of our
Kubernetes-based platform, focusing on scalability, developer
experience, and system reliability.
- Architect and maintain automation around Kubernetes, ensuring
that the platform is easy for developers to use and requires
minimal toil to deploy or modify workloads in a self-service
model.
- Collaborate with cross-functional teams (developers, leaders,
and other infrastructure teams) to gather requirements, build
consensus, and deliver impactful solutions.
- Integrate observability into the platform, using tools like
Datadog, Prometheus, Grafana, New Relic, and Splunk to monitor
system health and performance.
- Drive infrastructure-as-code initiatives using tools like
Kubernetes Operators, Helm, Kustomize, and Terraform, promoting
automation, repeatability, and reliability.
- Ensure that the platform integrates seamlessly with CI/CD
pipelines (using Argo CD / Workflows / Rollouts, GitHub Actions,
Jenkins, or similar) and continuously improve developer
workflows.
- Contribute to the operational excellence of the platform,
including on-call responsibilities and incident management, while
building self-healing capabilities where possible.
- Act as a mentor to other engineers on the team, promoting
growth and knowledge sharing, ensuring that the team thrives even
in the absence of specific individuals.
- Foster a culture of collaboration, empathy, and trust within
the team and across departments, helping to bridge gaps between
engineering and other business functions.
- Take a hands-on approach to problem-solving, sometimes
submitting PRs to resolve issues in codebases or providing detailed
solutions when teams need assistance.
What We're Looking For:
- 8+ years of experience in some combination of cloud-native
software development, platform engineering, site reliability
engineering, and/or cloud infrastructure, with a more recent focus
on Kubernetes and the cloud-native ecosystem.
- Strong expertise in Kubernetes and related CNCF projects (e.g.,
Argo CD/Workflows, Backstage, Envoy, CoreDNS, and more) and in
simplifying complex cloud infrastructure for broader teams.
- Operational experience at scale with technologies like Kafka
and Airflow.
- Proficient in common infrastructure languages like Golang,
Python, and Terraform, with experience developing and operating
production systems.
- Extensive experience with AWS cloud infrastructure, networking,
and security.
- Proven experience with monitoring and observability tools
(Datadog, Splunk, Prometheus, Grafana Cloud, etc.) and a strong
understanding of system performance tuning.
- Expertise in building abstractions over Kubernetes to simplify
developer interaction with the platform.
- Excellent communication skills, with the ability to collaborate
across teams, build consensus, and drive initiatives in a
high-pressure environment.
- High level of empathy and patience, with a commitment to
mentoring and helping others succeed, and the ability to
incorporate feedback and turn it into actionable improvements.
- Experience with infrastructure-as-code and automation
(Terraform, Helm, Kustomize, etc.), with a focus on reducing toil
and operational overhead.
- A mindset focused on improving the developer experience and
business alignment, with the flexibility to make decisions that may
go against ideal technical preferences when necessary.