INDEX
Explanations
religious mentions or exclamations
mentions of the word "God" or variations of it
Negative Logits
Luxem        -0.74
contin       -0.62
detachment   -0.62
Altern       -0.61
phrine       -0.61
URRENT       -0.61
akeru        -0.61
adul         -0.60
DERR         -0.59
auth         -0.58
Positive Logits
father       1.27
frey         1.25
mother       1.20
forbid       1.08
parents      1.04
win          1.01
zilla        0.99
god          0.97
bless        0.96
Almighty     0.95
Activations Density 0.032%