Machine Learning Engineer — Multilingual Data
mid
via Ashby
About this role
We’re looking for a Machine Learning Engineer to own and scale our multilingual data pipeline—from sourcing and curation to evaluation and continuous improvement. You’ll work closely with researchers and infra engineers to ensure our models perform robustly across languages, scripts, and cultural contexts.
This role sits at the intersection of data, research, and production ML and is ideal for someone who cares deeply about data quality, linguistic diversity, and model generalization beyond English.
What you’ll do
- Design, build, and maintain large-scale multilingual datasets across high- and low-resource languages
- Develop data pipelines for collection, cleaning, normalization, deduplication, and labeling…
What we'd score you on
reqspace match rubric
Five dimensions, recruiter-grade. Upload your resume and we'll generate a written explanation of where you fit and where the gaps are.
1. Skills match. For this role: python, spark, ray
2. Level fit. This role is mid-level. We check your trajectory against it.
3. Domain experience. Your work in the role's domain matters more than your years total. We weight recent and direct experience.
4. Recency. A skill you used last quarter weighs more than one from five years ago. We grade on recency, not lifetime.
5. Location fit. This role is based in a specific location. We weight your proximity and willingness to relocate.
Skills in this role
Pulled from the job description. These are the keywords we'll weight when scoring your fit.
python, spark, ray
