About The Company

Turing, headquartered in San Francisco, California, is recognized as the world's leading research accelerator dedicated to frontier artificial intelligence labs. As a trusted partner for global enterprises, Turing specializes in deploying advanced AI systems that drive innovation and operational excellence. The company supports its clients through two primary avenues: first, by accelerating frontier research utilizing high-quality data, sophisticated training pipelines, and top-tier AI researchers with expertise in coding, reasoning, STEM disciplines, multilinguality, multimodality, and autonomous agents; second, by translating this cutting-edge research into proprietary AI systems that are reliable, impactful, and capable of delivering measurable results that positively influence the company's bottom line. Turing's commitment to excellence and innovation positions it at the forefront of AI research and enterprise transformation.

About The Role

We are seeking experienced Data Analysts specializing in Machine Learning Evaluation Benchmarks (MLE Bench) to join our dynamic team. In this role, you will be instrumental in conducting benchmark-driven evaluation projects that assess real-world machine learning systems. Your primary responsibility will involve hands-on analysis of production-like datasets, metrics, and ML outputs to evaluate, diagnose, and enhance the performance of advanced AI models. This position offers a unique opportunity to work at the intersection of data analysis and machine learning, contributing to the development and refinement of evaluation frameworks that ensure AI systems perform reliably and effectively in real-world scenarios.

Qualifications

The ideal candidate will bring a minimum of three years of professional experience as a Data Analyst or an analytics-focused engineer. You should demonstrate strong proficiency in Python, especially for data analysis tasks, and possess solid experience working with SQL and relational databases. Experience analyzing ML outputs, evaluation metrics, and understanding statistical concepts is essential. Candidates must be comfortable working with large, complex datasets and possess the ability to extract reliable insights through analytical reasoning. Excellent communication skills in English, both spoken and written, are required to effectively collaborate with cross-functional teams. A proven track record of writing clean, well-documented, and reproducible analytical code will be highly valued.

Responsibilities

Analyze structured and unstructured datasets generated from machine learning training, inference, and evaluation pipelines to identify patterns, anomalies, and insights.
Define, compute, and validate evaluation metrics that measure model performance, robustness, and behavior across various benchmarks.
Investigate data distributions, model outputs, and failure modes, especially in edge cases relevant to benchmark tasks, to inform model improvements.
Develop and execute Python and SQL scripts to support data analysis, generate comprehensive reports, and facilitate evaluation workflows.
Validate data quality, consistency, and accuracy across multiple datasets and experimental setups to ensure reliable results.
Create clear, well-structured analytical artifacts and documentation to support reproducibility and knowledge sharing within the team.
Collaborate closely with ML engineers and researchers to design challenging, real-world evaluation scenarios that push the boundaries of current AI systems and benchmarks.

Benefits

Joining Turing as a freelancer offers the flexibility of working in a fully remote environment, allowing you to manage your work-life balance effectively. You will have the opportunity to contribute to cutting-edge AI projects alongside leading LLM companies, gaining exposure to the latest advancements in artificial intelligence. The role provides an engaging and intellectually stimulating environment where your expertise directly influences the development and evaluation of high-performance AI systems. Additionally, Turing offers competitive compensation and the chance to expand your professional network through collaboration with top-tier AI professionals and organizations worldwide.

Equal Opportunity

Turing is committed to fostering an inclusive and diverse workplace. We provide equal employment opportunities to all applicants regardless of race, color, religion, gender, sexual orientation, gender identity or expression, national origin, age, disability, or any other protected status. We believe that diverse perspectives and backgrounds drive innovation and excellence, and we are dedicated to creating an environment where everyone can thrive and contribute meaningfully to our mission.


Data Analyst
