Looking for Expert NLP/ML Engineer for Language Translation Model Training (Indic Languages)

Remote Full-time
Project Description:
I am looking to hire an experienced NLP/ML engineer to train high-quality machine translation models for Indic languages. The goal is to develop single language-pair models, such as:
● English → Telugu
● English → Hindi
(and additional language pairs, if needed)

You may choose the most suitable model architecture based on your expertise (e.g., mBART, mT5, NLLB fine-tuning, Transformer variants), as long as the final models deliver strong translation quality.

Dataset:
You can use the AI4Bharat datasets, including:
● Samanantar
● BPCC
● Other open Indic parallel corpora

Scope of Work:
The freelancer will be responsible for:

1. Data Handling
● Cleaning, filtering, and preprocessing datasets
● Sentence alignment (if needed)
● Tokenization and vocabulary preparation (SentencePiece, BPE, etc.)

2. Model Training
● Selecting an appropriate model architecture
● Training single language-pair translation models
● Implementing best practices for training efficiency (FP16, gradient accumulation, etc.)
● Hyperparameter tuning
● Checkpoint management and monitoring

3. Evaluation
● Computing BLEU, SacreBLEU, and other relevant metrics
● Providing side-by-side qualitative translation samples
● Benchmarking against baseline models

4. Delivery
● Final trained model weights
● Inference scripts (Python) for quick testing
● Instructions for running and continuing training
● Documentation of the preprocessing and training pipeline
● Optional: Dockerfile or virtual environment setup

Requirements:
The ideal candidate should have:
● Strong experience in NLP, Transformers, and neural MT models
● Prior work with Indic languages (a big plus)
● Experience with training libraries such as PyTorch, Hugging Face Transformers, Fairseq, OpenNMT, or similar
● Ability to handle large-scale training and dataset preprocessing
● Familiarity with SentencePiece, tokenization strategies, and MT evaluation metrics
● Ability to deliver clean, well-documented code

Additional Notes:
● Compute resources can be discussed (I can provide compute, or you can use your own).
● More language pairs may be added later as separate follow-up projects.
● Translation quality is the highest priority.
