INDEX
Explanations
references to small or low characteristics in various contexts
Negative Logits
Token           Logit
umab            -0.55
uesas           -0.50
mployment       -0.49
pidou           -0.48
colesterol      -0.47
                -0.45
closedir        -0.45
iasis           -0.45
ango            -0.45
textAlignment   -0.45
Positive Logits
Token           Logit
small           1.35
small           1.27
SMALL           1.26
Small           1.23
Small           1.20
tiny            1.18
SMALL           1.15
smaller         1.02
کوچک            1.02
smal            1.00
Activations Density: 0.861%