Search Results - Musheng Chen
Showing 1-3 of 3 results
1
Optimizing the Learnable RoPE Theta Parameter in Transformers by Zhigao Huang, Musheng Chen
Published 2025-01-01
Article
2
Dynamic Mixture of Experts for Adaptive Computation in Character-Level Transformers by Zhigao Huang, Musheng Chen, Shiyan Zheng
Published 2025-06-01
Article
3
Spectral Adaptive Dropout: Frequency-Based Regularization for Improved Generalization by Zhigao Huang, Musheng Chen, Shiyan Zheng
Published 2025-06-01
Article