Explanations
indicators of comparison or gradation
Negative Logits
Up        -0.15
Up        -0.15
alet      -0.14
cce       -0.14
leitung   -0.14
unal      -0.13
รณ        -0.13
culate    -0.13
arah      -0.13
alie      -0.13
Positive Logits
over      0.42
under     0.41
less      0.29
under     0.28
fewer     0.26
_under    0.26
shy       0.26
-under    0.24
dưới      0.24
below     0.24
Activations Density 0.023%