This model was created by analyzing other Qwen2.5-7B models and selecting the optimal layers from them based on their dimensional utilization efficiency, as measured by the Normalized Effective Rank (NER). The NER is computed as follows:
Input: Weight matrix for each model layer
Compute singular values σᵢ where σᵢ ≥ 0 # σᵢ represents the importance of each dimension
Filter values above numerical threshold (>1e-12)
Sum all singular values: S = Σσᵢ # S acts as normalization factor
Create probability distribution: pᵢ = σᵢ/S # converts singular values to probabilities summing to 1
Compute Shannon entropy: H = -Σ(pᵢ * log₂(pᵢ)) # measures information content
Calculate maximum possible entropy: H_max = log₂(n)
Final NER score = H/H_max # normalizes score to [0,1] range
Results in value between 0 and 1 for each model layer
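
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the code used to build the model: the function name `normalized_effective_rank` and its `threshold` parameter are hypothetical, `n` is assumed to be the number of singular values that survive the threshold filter, and returning 0.0 when fewer than two values survive is an assumption made here to avoid dividing by zero. The sketch starts from a layer's singular values, which could be obtained from a weight matrix `W` with, for example, `numpy.linalg.svd(W, compute_uv=False)`.

```python
import math


def normalized_effective_rank(singular_values, threshold=1e-12):
    """Sketch of the NER steps listed above (hypothetical helper)."""
    # Keep only singular values above the numerical threshold.
    sigma = [s for s in singular_values if s > threshold]
    n = len(sigma)  # assumption: n counts the values kept after filtering

    # Assumption: with zero or one surviving value, H_max = log2(n) is
    # undefined or zero, so report 0.0 instead of dividing by zero.
    if n <= 1:
        return 0.0

    total = sum(sigma)                               # S = sum of sigma_i
    p = [s / total for s in sigma]                   # p_i = sigma_i / S
    entropy = -sum(pi * math.log2(pi) for pi in p)   # H = -sum(p_i * log2(p_i))
    max_entropy = math.log2(n)                       # H_max = log2(n)
    return entropy / max_entropy                     # NER = H / H_max


# A flat spectrum uses every dimension equally; a spectrum dominated by
# one singular value uses almost none of the others.
uniform = normalized_effective_rank([1.0, 1.0, 1.0, 1.0])
skewed = normalized_effective_rank([1.0, 1e-6, 1e-6, 1e-6])
print(uniform, skewed)
```

A flat spectrum scores 1.0, while a spectrum concentrated in a single singular value scores close to 0, which matches the reading of NER as a measure of how evenly a layer uses its available dimensions.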