INDEX
Explanations
terms related to scale and magnitude
Negative Logits
Smithsonian   -0.19
Sunder        -0.17
Sweat         -0.17
iful          -0.16
Samantha      -0.16
ugins         -0.15
SVN           -0.15
Sierra        -0.15
Sidd          -0.15
Sandy         -0.15
Positive Logits
scale         0.77
Scale         0.68
scale         0.65
scales        0.61
-scale        0.61
Scale         0.60
_scale        0.57
.scale        0.57
SCALE         0.56
scale         0.50
Activations Density 0.081%