Explanations
declarative questions or statements about identity or designation
Negative Logits
vale         -0.16
fst          -0.15
anything     -0.14
uren         -0.14
insky        -0.14
amba         -0.14
omo          -0.14
erais        -0.14
anz          -0.14
Fusion       -0.14
Positive Logits
rish         0.14
çĭł          0.14
singular     0.14
_locator     0.14
alara        0.13
adow         0.13
Wikispecies  0.13
omanip       0.13
estr         0.13
زر           0.13
Activations Density: 0.017%