INDEX
Explanations
phrases indicating universality or totality
Negative Logits
atmos      -0.14
oslav      -0.14
orthand    -0.14
koc        -0.14
prise      -0.14
ayıp       -0.13
urga       -0.13
lexport    -0.13
APPLE      -0.13
utz        -0.13
Positive Logits
centralized      0.18
proper           0.18
decentral        0.15
Proper           0.14
ifton            0.14
decentralized    0.14
ple              0.14
icht             0.14
iesel            0.14
Soccer           0.14
Activations Density 0.000%