Machine Learning Platform R&D Expert Engineer (Parameter Server Direction) - EGO Team
Philippines
Permanent
Full-time
6 days ago
Job Description:
Develop distributed Parameter Server (PS) systems for large-scale sparse model training and inference platforms in the search, advertising, and recommendation domains. The system should support high-throughput parameter read/write and update operations, handle hundreds of billions of features and TB-level sparse models, enable online real-time learning, and meet algorithmic needs such as feature admission and expiration.
Participate in the development of the one-stop machine learning platform, integrating the PS system into the platform to provide a user-friendly, stable, high-performance, platform-level distributed parameter service.
Enhance the platform's efficiency and usability, accelerating the model iteration process for algorithm teams.

Requirements:
Bachelor's degree or above in Computer Science, Electronics, Automation, Software Engineering, or a related field.
At least 3 years of relevant hands-on experience.
Proficient in C++ programming with strong low-level technical skills; adept at multi-threaded programming, lock optimization, memory pools, thread pools, template programming, GDB debugging, performance tuning, and RPC frameworks.
Familiarity with distributed PS systems, distributed system backend optimization, high-performance in-memory KV systems, KV storage systems based on NVMe SSDs, and high-performance client-server architecture systems is a plus.
Highly passionate about computer technology and proactive in learning, with a strong spirit of in-depth research and hands-on practice.
Maintains high standards and strict requirements for delivered code; works with rigor and attention to detail.
Strong team player with excellent continuous learning ability.