Explanations
references to religious leaders and communal values
Negative Logits
Token     Logit
undos     -0.15
Boone     -0.15
xcc       -0.15
ucene     -0.14
oload     -0.14
ktop      -0.13
Sabb      -0.13
arton     -0.13
Norris    -0.13
uese      -0.13
Positive Logits
Token     Logit
ueblo     0.16
chl       0.15
avors     0.15
hum       0.14
asc       0.14
ãģ¿       0.14
aws       0.14
Kraj      0.14
atown     0.14
ازÙĦ      0.14
Activations Density 0.233%