Infrastructure Reliability Engineer
Whop
About Whop
Whop is the platform powering the next generation of online business. We provide entrepreneurs with everything they need to launch, grow, and monetize digital products: global payment infrastructure, flexible storefronts, community engagement tools, and marketplace distribution.
We've paid out over $2.5 billion to thousands of businesses and built a user base of more than 15 million.
Whop has raised nearly $80 million from world-class investors including Bain Capital, Insight Partners, Peter Thiel, Justin Kan (Twitch), Justin Mateen (Tinder), the Chainsmokers, and others. Our team includes talented leaders from top fintech and technology companies like Meta, Robinhood, and Quora, as well as many experienced founders.
About the role
When millions of people run their businesses on what you ship, downtime is not an option. Reliability is a core tenet of our infra team. You’ll be building the systems that allow us to scale another 100x while maintaining the reliability that our users’ livelihoods depend on. You’ll ship real outcomes with judgment, taste, and a steady hand under pressure.
What you'll do
- Own uptime for our infrastructure
- Upgrade our observability and monitoring stack to better understand system health
- Improve systems around error collection and real-time alerting
- Design staged rollout and rollback systems to improve safety of production deployments
- Harden our infrastructure against real-world threats and implement guardrails that keep production safe
- Improve the performance of critical systems through better infra, code optimization, caching, etc.
- Work with product teams to make development faster and safer
- Create standards around real-time event collection for use in insights and experimentation
What you need
- Minimum of 5 years of experience operating production infrastructure for high-availability systems
- Deep observability experience including metrics, logs, tracing, and alerting
- Security engineering competence
- Excellence with databases, queues, and caching strategies
- Strong computing and networking fundamentals
- Experience with deploying production software safely
- Ability to work in a fast-paced startup environment with imperfect information
- Strong debugging instincts under pressure
Nice to have
- Payments/fintech experience
- Experience designing SLO programs end-to-end (error budgets, alert routing, paging policy)
- Experience with infrastructure as code
- Strong programming ability in a backend language
- A track record of taking ownership from root cause through permanent fix
Benefits Overview
- Minimum cash comp of $250,000 + a competitive equity package
- Unlimited PTO, with full health, vision, and dental coverage
- Free access to Equinox
- $9k annual rent stipend if you live within 4 blocks of the Brooklyn office
- Lunch & dinner paid for Monday through Friday
- The latest MacBook Pro & tech accessories