Explanations
references to religious texts and figures related to Christianity
Negative Logits
deaux        -0.15
pl           -0.15
ipse         -0.15
(            -0.14
ienne        -0.14
extr         -0.14
processes    -0.14
ees          -0.14
sab          -0.14
sacrifices   -0.13
Positive Logits
urdy         0.16
tero         0.15
ä¼¼          0.15
\common      0.15
oron         0.15
arcer        0.14
è¢ĸ          0.14
ARIANT       0.14
дем          0.14
ikler        0.14
Activations Density 0.136%