HTEC Group is a global consulting, software engineering, and digital product development company that empowers the world's best high-tech companies, disruptive startups, and global enterprises with innovative product design and sophisticated engineering services.
HTEC Group was founded in 2008 in Belgrade, Serbia, and today has its global headquarters in San Francisco. The company has consultancy, innovation, and product design offices in Silicon Valley, New York, and London, with its technological heart spread across development centers in Central and Southeast Europe. Overall, HTEC employs more than 2,000 highly skilled professionals in 29 locations across 12 countries.
HTEC combines Silicon Valley-based design thinking with the best engineering talent to support global clients with complete digital product development, from strategy and conceptualization to digital product design and agile engineering at scale. The company possesses vast expertise across a multitude of domains, including Healthcare, Retail, Transportation and Smart Mobility, Logistics, FinTech, Green Energy, Media, and Deep Technology.
Job Description
How you’ll contribute:
Implement advanced compiler techniques to enhance the performance and efficiency of ML models
Develop and maintain tools for the efficient compilation and execution of ML models
Conduct performance analysis and benchmarking to identify bottlenecks and areas for improvement
Implement and optimize kernels using hardware-specific instructions
Document and present findings, optimizations, and improvements to stakeholders
Qualifications
Required Qualifications:
Bachelor's or Master's degree in Electrical Engineering, Computer Science, or a related field
Proficiency in C/C++
Willingness to work on low-level software
Understanding of computer architecture
Basic Python experience
Experience with Git and test-driven development (TDD) principles
Strong analytical and problem-solving skills
Additional Information
Nice to have:
Hands-on experience writing SIMD and/or multi-threaded high-performance code, as well as target-specific optimizations
Familiarity with DSP architectures and hardware acceleration techniques
Experience with performance profiling tools and methodologies
Basic understanding of ML concepts
Basic knowledge of Linear Algebra and matrix operations
Understanding of transformer models used in large language models