Publications & Projects
Research papers and open-source projects advancing AI compute efficiency
Discover our contributions to the field of AI computing through peer-reviewed research and practical tools for the research community.
Open Source Projects
Tools and frameworks for better compute utilization
OptiML
An open-source toolkit for optimizing machine learning workloads with automatic memory management and hardware-aware optimizations
Key Features:
- Automatic memory optimization
- Mixed precision training
- Model compression
- Hardware-aware scheduling
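To give a flavor of the model-compression feature, here is a minimal, framework-agnostic sketch of symmetric int8 weight quantization, one common compression technique. The function names are illustrative, not OptiML's actual API.

```python
def quantize_int8(weights):
    """Symmetric int8 quantization: map floats onto integers in [-127, 127].

    A single scale factor per tensor keeps dequantization cheap.
    """
    # `or 1.0` guards against an all-zero tensor (scale would be 0).
    scale = max(abs(w) for w in weights) / 127 or 1.0
    return [round(w / scale) for w in weights], scale

def dequantize_int8(quantized, scale):
    """Recover approximate float weights from int8 values."""
    return [q * scale for q in quantized]

weights = [0.5, -1.27, 0.03, 1.0]
quantized, scale = quantize_int8(weights)
restored = dequantize_int8(quantized, scale)
```

The round trip loses at most half a quantization step per weight, which is the usual accuracy/memory trade-off of post-training quantization.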
GPUProfiler
Advanced GPU profiling and optimization tool for deep learning workloads with real-time monitoring and intelligent suggestions
Key Features:
- Real-time performance monitoring
- Memory usage analysis
- Kernel optimization suggestions
- Tensor Core utilization tracking
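The real-time monitoring idea can be sketched with a simple timing context manager. This is a hypothetical stand-in using wall-clock timing, not GPUProfiler's interface; real GPU profiling would hook into driver-level counters instead.

```python
import time
from contextlib import contextmanager

@contextmanager
def profile(name, records):
    """Record wall-clock duration of a named code region into `records`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        records.setdefault(name, []).append(time.perf_counter() - start)

records = {}
with profile("matmul", records):
    sum(i * i for i in range(10_000))  # stand-in for a GPU kernel launch
```

Accumulating a list of durations per region name is what lets a profiler report per-kernel statistics (mean, max, call count) in real time.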
DistTrain
High-performance distributed training framework with support for multiple parallelism strategies and fault tolerance
Key Features:
- Multi-GPU support
- Fault tolerance
- Auto-scaling
- Gradient compression
- 3D parallelism
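Gradient compression, one of the features listed above, is often done with top-k sparsification: only the largest-magnitude gradient entries are communicated between workers. A minimal sketch (illustrative names, not DistTrain's API):

```python
def topk_compress(grad, k):
    """Keep the k largest-magnitude entries as (index, value) pairs.

    Workers exchange these sparse pairs instead of the full dense gradient.
    """
    idx = sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)[:k]
    return [(i, grad[i]) for i in sorted(idx)]

def topk_decompress(pairs, n):
    """Rebuild a dense gradient of length n, zero-filling dropped entries."""
    out = [0.0] * n
    for i, v in pairs:
        out[i] = v
    return out

grad = [0.1, -2.0, 0.05, 1.5, -0.3]
pairs = topk_compress(grad, k=2)
restored = topk_decompress(pairs, len(grad))
```

Production systems usually add error feedback (accumulating the dropped residual locally) so the compression bias does not hurt convergence.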
FlashAttention
Implementation of memory-efficient attention mechanisms for long-sequence modeling
Key Features:
- IO-aware computation
- Memory efficiency
- Hardware acceleration
- Drop-in replacement
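The IO-aware trick at the heart of FlashAttention-style kernels is the online softmax: attention scores are consumed block by block while a running maximum and normalizer are maintained, so the full score matrix never has to be materialized in memory. A scalar sketch of that recurrence:

```python
import math

def online_softmax(scores):
    """Single-pass softmax statistics over a stream of scores.

    Returns (m, d) where m is the running max and d the normalizer,
    so that softmax(s_i) = exp(s_i - m) / d. Rescaling d by
    exp(m - m_new) whenever the max grows keeps the pass numerically stable.
    """
    m, d = float("-inf"), 0.0
    for s in scores:
        m_new = max(m, s)
        d = d * math.exp(m - m_new) + math.exp(s - m_new)
        m = m_new
    return m, d

scores = [1.0, 2.0, 3.0]
m, d = online_softmax(scores)
probs = [math.exp(s - m) / d for s in scores]
```

The same rescaling applies when merging partial results across tiles of the attention matrix, which is what lets the kernel stay in fast on-chip memory.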
AutoML-Zero
Automated machine learning framework for discovering optimal neural architectures and hyperparameters
Key Features:
- Architecture search
- Hyperparameter optimization
- Multi-objective optimization
- Hardware constraints
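Hyperparameter optimization in its simplest form is random search over a configuration space. The sketch below uses a toy objective standing in for validation loss; the function names and the space are illustrative, not AutoML-Zero's API.

```python
import random

def random_search(objective, space, trials, seed=0):
    """Sample configurations uniformly from `space`, keep the lowest score."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    best_cfg, best_score = None, float("inf")
    for _ in range(trials):
        cfg = {name: rng.choice(values) for name, values in space.items()}
        score = objective(cfg)
        if score < best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score

space = {"lr": [0.1, 0.01, 0.001], "layers": [2, 4, 8]}
# Toy objective: pretend validation loss is minimized at lr=0.01, layers=4.
objective = lambda cfg: abs(cfg["lr"] - 0.01) + abs(cfg["layers"] - 4)
best, loss = random_search(objective, space, trials=200)
```

Real systems replace uniform sampling with bandit or Bayesian strategies, and fold hardware constraints (latency, memory budget) into the objective as extra terms.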
QuantumFlow
Hybrid quantum-classical computing framework for AI applications with quantum circuit optimization
Key Features:
- Quantum circuit optimization
- Hybrid algorithms
- Error mitigation
- Hardware integration
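At its smallest scale, the classical half of a hybrid quantum-classical workflow is just simulating gates as matrix-vector products on a state vector. A single-qubit sketch (purely illustrative, not QuantumFlow's API):

```python
import math

def apply_gate(gate, state):
    """Apply a 2x2 gate matrix to a single-qubit state vector [a, b]."""
    a, b = state
    return [gate[0][0] * a + gate[0][1] * b,
            gate[1][0] * a + gate[1][1] * b]

# Hadamard gate: maps |0> to the equal superposition (|0> + |1>) / sqrt(2).
H = [[1 / math.sqrt(2),  1 / math.sqrt(2)],
     [1 / math.sqrt(2), -1 / math.sqrt(2)]]

state = apply_gate(H, [1.0, 0.0])
# Born rule: measurement probabilities are squared amplitudes, here ~[0.5, 0.5].
probs = [abs(amp) ** 2 for amp in state]
```

A hybrid algorithm such as a variational eigensolver wraps this kind of circuit evaluation in a classical optimization loop over the gate parameters.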