INDEX
Explanations
statements about the status or condition of various subjects or entities
Negative Logits
ajo       -0.17
ensis     -0.16
Armed     -0.15
flush     -0.15
ifr       -0.15
Rough     -0.14
ิà¸ģ       -0.14
izar      -0.14
860       -0.14
Armour    -0.14
Positive Logits
anyhow    0.18
eing      0.17
udiant    0.15
synonym   0.15
gonna     0.15
bk        0.15
åĩī       0.15
ija       0.15
gio       0.14
ذات       0.14
Activations Density: 0.284%