Optimizing the Learnable RoPE Theta Parameter in Transformers
Rotary Position Embedding (RoPE) enhances Transformer models by encoding relative positions through a frequency parameter $\theta$, but conventional implementations fix $\theta$...
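The abstract describes treating RoPE's base frequency $\theta$ as a trainable parameter rather than a fixed constant. As a minimal sketch of that idea (an assumed parameterization, not the paper's actual implementation, which is behind the linked record; the class name `LearnableRoPE` and the log-space reparameterization are illustrative choices), a PyTorch module might look like:

```python
import torch
import torch.nn as nn

class LearnableRoPE(nn.Module):
    """Illustrative RoPE with a learnable base frequency theta.

    Assumption: theta is stored in log-space so it stays positive
    while receiving gradients; the paper may parameterize it differently.
    """

    def __init__(self, head_dim: int, init_theta: float = 10000.0):
        super().__init__()
        assert head_dim % 2 == 0, "RoPE requires an even head dimension"
        self.head_dim = head_dim
        self.log_theta = nn.Parameter(torch.tensor(float(init_theta)).log())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, head_dim)
        seq_len = x.shape[1]
        half = self.head_dim // 2
        theta = self.log_theta.exp()
        # Per-pair rotation frequencies: theta^{-2i/d}, i = 0..d/2-1.
        freqs = theta ** (-torch.arange(half, device=x.device) * 2.0 / self.head_dim)
        angles = torch.arange(seq_len, device=x.device)[:, None] * freqs[None, :]
        cos, sin = angles.cos(), angles.sin()  # each (seq_len, half)
        x1, x2 = x[..., :half], x[..., half:]
        # Standard rotary transform applied to each (x1, x2) pair.
        return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)

# Usage: rotate query/key activations before attention.
rope = LearnableRoPE(head_dim=64)
q = torch.randn(2, 128, 64)
q_rot = rope(q)  # theta receives gradients through cos/sin
```

Because the rotation angles depend smoothly on $\theta$ via `cos`/`sin`, the base frequency can be optimized jointly with the rest of the model by ordinary backpropagation.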
| Main Authors: | Zhigao Huang, Musheng Chen |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2025-01-01 |
| Series: | IEEE Access |
| Online Access: | https://ieeexplore.ieee.org/document/11084811/ |
Similar Items
- The Subword‐Character Multi‐Scale Transformer With Learnable Positional Encoding for Machine Translation
  by: Wenjing Yao, et al.
  Published: (2025-07-01)
- Hardware Trojan vulnerability assessment in digital integrated circuits using learnable classifiers
  by: Hadi Jahanirad, et al.
  Published: (2024-07-01)
- A domain free of the zeros of the partial theta function
  by: V. Kostov
  Published: (2023-01-01)
- R-Sparse R-CNN: SAR Ship Detection Based on Background-Aware Sparse Learnable Proposals
  by: Kamirul Kamirul, et al.
  Published: (2025-01-01)
- Analysis of Gearbox Bearing Fault Diagnosis Method Based on 2D Image Transformation and 2D-RoPE Encoding
  by: Xudong Luo, et al.
  Published: (2025-06-01)