INDEX
Explanations
specific references to people, their actions, and events related to mortality or accusations
Negative Logits
det        -0.15
eds        -0.14
SND        -0.14
avenport   -0.14
sessions   -0.14
Äħd        -0.14
ait        -0.14
pective    -0.14
pii        -0.14
412        -0.13
Positive Logits
-animate   0.16
æĤł        0.16
ì´Ī        0.15
ERA        0.15
-su        0.14
aged       0.14
اسÙĩ       0.14
Ŀi         0.14
صÙĪØ±      0.14
âĹĦ        0.14
Activations Density: 0.030%