INDEX
Explanations
phrases indicating a comparative state or condition
Negative Logits
lew      -0.15
hq       -0.15
Ñĸп      -0.14
aç       -0.14
ÃŃd      -0.14
orts     -0.13
znik     -0.13
icens    -0.13
zano     -0.13
uÄį      -0.13
Positive Logits
there      0.25
there      0.20
none       0.19
it         0.18
ap         0.17
nobody     0.17
opposed    0.16
There      0.16
There      0.16
although   0.16
Activations Density 0.092%
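
A minimal sketch of how a density figure like the one above could be computed, assuming it means the fraction of token positions on which the feature has a nonzero activation. That definition is an assumption, not something this page states. The function name `activation_density` and the sample data are hypothetical and chosen only to reproduce a value of the same size.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumed definition: density = (# active positions) / (# positions).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 100,000 token positions, 92 of them active.
sample = [0.0] * 99_908 + [1.3] * 92
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.092% would mean the feature fires on roughly 1 in every 1,100 tokens, so it is a sparse feature.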