Explanations
terms related to scaling
Negative Logits
**/              -0.67
RetentionPolicy  -0.61
AndEndTag        -0.59
__*/             -0.58
ifolium          -0.56
ICIENCY          -0.56
'][$             -0.55
Gemeinden        -0.55
wój              -0.54
🏽               -0.54
Positive Logits
Scales  1.38
scales  1.37
Scales  1.30
SCALE   1.27
Scale   1.22
scales  1.20
Scale   1.15
SCALE   1.14
scale   1.09
Scal    1.06
Activations Density 0.080%