INDEX
Explanations
phrases related to religious contexts and expressions
expressions of gratitude or exclamations directed towards a divine figure
Negative Logits
Token        Logit
Luxem        -0.77
kindred      -0.66
adul         -0.66
detachment   -0.66
phrine       -0.65
Altern       -0.64
ACP          -0.63
auth         -0.60
Sapp         -0.59
URRENT       -0.59
Positive Logits
Token        Logit
frey         1.22
father       1.17
mother       1.14
forbid       1.03
god          0.98
win          0.96
zilla        0.96
parents      0.96
speed        0.86
boss         0.86
Activations Density 0.038%