INDEX
Explanations
statements about actions taken by a group or individuals
Negative Logits
Token       Logit
scor        -0.15
ɵ           -0.15
æĥł         -0.15
_compile    -0.14
eteria      -0.14
ktor        -0.14
992         -0.14
983         -0.13
zeich       -0.13
targ        -0.13
Positive Logits
Token       Logit
zik         0.15
Loc         0.15
ó           0.15
ijd         0.14
antan       0.14
Gle         0.14
Durham      0.14
ceae        0.14
Vine        0.14
dux         0.14
Activations Density: 0.286%