Data Engineer
Sunset
Software Engineering, Data Science
New York, NY, USA
USD 180k-280k / year + Equity
Location: New York
Employment Type: Full time
Department: Engineering
Compensation: $180K – $280K • Offers Equity
About Sunset
Sunset is building the data layer for real-world AI training. We work with frontier labs to turn messy, multi-modal enterprise data into the highest-quality training data on the market — sourced from the hundreds of venture-backed startups we've helped wind down.
We're a fast-growing team working in person in Dumbo, Brooklyn, backed by Floodgate, Afore Capital, Hustle Fund, and incredible entrepreneurs.
The Role
As a Data Engineer at Sunset, you'll own the pipeline that turns raw, chaotic enterprise data into the highest-quality training data on the market. One of our core technical challenges is entity resolution and de-identification across different sources and modalities. An even deeper challenge is understanding the node structures and linkages well enough to effectively reconstruct the business world this data comes from. All of this happens on sensitive data, which means security and privacy aren't a separate workstream but are built into every pipeline, system, and decision we make.
What You'll Work On
You'll own problems end-to-end. Some examples of what you might tackle in your first 90 days:
Designing the de-identification layer that replaces PII with stable pseudonyms while preserving every relationship across every source
Building coreference resolution across Slack threads, email chains, and Linear comments so that "me," "him," and first-name mentions all resolve to the right canonical entity
Hardening how we ingest, store, and process sensitive client data — from encryption and access controls to audit trails and isolation boundaries
Extending our entity resolution pipeline to handle new modalities — think audio, video, design files, or embedded references inside documents
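To make the first example above more concrete, here is a minimal sketch of one common way to produce stable pseudonyms: a keyed hash (HMAC-SHA256) over a normalized identifier. This is an illustration only, not a description of Sunset's pipeline. The function name `pseudonymize`, the `person_` prefix, the normalization rule, and the example key are all assumptions made for this sketch.

```python
import hashlib
import hmac


def pseudonymize(value: str, secret_key: bytes, prefix: str = "person") -> str:
    """Map a PII value to a stable pseudonym using HMAC-SHA256.

    Hypothetical helper: the same normalized input and key always give the
    same pseudonym, so links between records survive de-identification.
    """
    # Assumed normalization: trim whitespace and lowercase, so that
    # "Alice@Example.com" and " alice@example.com " map to one entity.
    normalized = value.strip().lower()
    digest = hmac.new(secret_key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()
    # Truncated for readability; a shorter digest raises collision risk.
    return f"{prefix}_{digest[:12]}"


# Placeholder key for illustration; a real key would live in a secrets manager.
key = b"example-key-not-for-production"

slack_author = pseudonymize("Alice@Example.com", key)
email_sender = pseudonymize(" alice@example.com ", key)
other_person = pseudonymize("bob@example.com", key)

print(slack_author == email_sender)  # same person across two sources
print(slack_author == other_person)  # different people stay distinct
```

Because the mapping depends only on the key and the normalized value, the same person gets the same pseudonym in every source, while anyone without the key cannot recompute the mapping from a guessed identifier. Real data would need far more careful normalization (name variants, aliases, multiple emails), which is where entity resolution comes in.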
You Might Be a Fit If
You are a product-minded engineer who has shipped data pipelines at scale
You have strong Python skills and are comfortable across NER, record linkage, and coreference resolution
You take security and privacy seriously and have built systems where getting it wrong wasn't an option
You want to own a hard, ambiguous problem end-to-end rather than wait for a PRD
AI is deeply integrated into your workflow and life
This Role Might Not Be a Fit If
You want to work remotely or on a hybrid schedule; we're in person 5 days/week in Dumbo
You want to do novel ML research — this role is applied, not research
You prefer long planning cycles or narrow ownership
Our Stack
Python, Postgres, Redis, AWS. We pick tools based on the problem, not the other way around.
Compensation & Benefits
$180K–$280K base + meaningful equity
100% covered medical, dental, and vision
Unlimited PTO
$500 in-office setup allowance
How We Hire
Intro Chat (20 min) – mutual fit and interests
Technical Session (1 hr) – collaborative problem-solving
Onsite (2–3 hrs) – product deep dive, system design, meet the team
Quick references → Offer