INDEX
Explanations
references to significant personal losses and tragedies
Negative Logits
ycop        -0.15
teri        -0.14
426         -0.14
736         -0.13
district    -0.13
deen        -0.13
loff        -0.13
245         -0.13
gener       -0.13
Trad        -0.13
Positive Logits
ove         0.17
oog         0.17
odb         0.16
ovy         0.15
anch        0.15
Verdana     0.15
ضÙĬ         0.14
.truth      0.14
frage       0.14
mund        0.14
Activations Density 0.222%