Explanations
references to divine authority and religious figures
Negative Logits
ansas       -0.16
Shaman      -0.16
odon        -0.15
urrection   -0.15
пиÑģ        -0.14
ubishi      -0.14
izio        -0.14
dar         -0.14
imet        -0.14
éĩİ         -0.14
Positive Logits
ÅĻik        0.15
omba        0.15
Jehovah     0.15
AGR         0.14
rief        0.14
adel        0.14
çĺ          0.14
asco        0.14
\Modules    0.14
δά          0.14
Activations Density: 0.008%