Jobs / Tencent Games / Sr. Cloud AI Infrastructure Engineer
This job is no longer available.

Posted 2026-05-22

Sr. Cloud AI Infrastructure Engineer

Responsibilities
  • Architecture Research: Conduct in-depth research into the underlying hardware logic of various AI accelerators; evaluate the power-efficiency ratio and suitability of different heterogeneous architectures in the context of Large Language Model (LLM) inference and training.
  • Operator & Performance Optimization: Design and optimize high-performance operator libraries for large-scale cloud computing environments; resolve long-tail latency issues in hardware scheduling, memory management, and distributed communication.
  • Interconnect Architecture Definition: Define the interconnect architecture; drive the virtualization, standardized access, and efficient pooling of heterogeneous computing resources in the cloud.
  • Technology Trend Analysis: Monitor global trends in semiconductors and accelerators; perform feasibility studies and experimental validation for the implementation of emerging technologies within cloud infrastructure.
Requirements
  • Master’s or Ph.D. degree in Computer Engineering, Electronic Engineering, Microelectronics, or a related field.
  • Expertise in GPGPU architectures or other mainstream AI accelerator architectures.
  • Proficient in parallel computing frameworks; deep understanding of low-level operator development languages (e.g., CUDA, Triton).
  • Solid understanding of large-scale distributed systems, cluster topologies (e.g., Fat-tree, Torus), and high-performance network protocols.
  • Familiar with the architectural evolution of leading global computing companies; able to objectively analyze the technical pros and cons and the engineering challenges of different architectural paths.
  • Experience in the application, optimization, or architectural design of ultra-large-scale accelerator clusters is preferred.
  • Experience in the low-level adaptation and performance tuning of mainstream deep learning frameworks (e.g., PyTorch, TensorFlow) is preferred.
Benefits
  • Sign-on payment (case-by-case basis)
  • Relocation package (case-by-case basis)
  • Restricted stock units (case-by-case basis)
  • Medical, dental, vision, life and disability benefits
  • Participation in the Company’s 401(k) plan
  • 15 to 25 days of vacation per year (depending on tenure)
  • 13 days of holidays throughout the calendar year
  • 10 days of paid sick leave per year