INDEX
Explanations
phrases related to religious beliefs and moral concepts
expressions of moral judgment and concepts related to justice and suffering
Negative Logits
ĸļ          -0.95
GOODMAN     -0.94
Timeline    -0.86
updated     -0.79
Ready       -0.78
Baby        -0.76
helicop     -0.76
uably       -0.75
wow         -0.75
Aerospace   -0.74
Positive Logits
sinful      1.62
sins        1.58
sinners     1.48
wicked      1.46
sin         1.43
evils       1.43
unbel       1.42
folly       1.41
malice      1.36
blasp       1.36
Activations Density 0.234%
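
The density figure above is most naturally read as the fraction of token positions on which the feature fires at all. The sketch below illustrates that reading. It is an assumption about how the number is defined, not the dashboard's actual code, and the function name `activation_density`, the `threshold` parameter, and the toy data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature is active; the dashboard's exact definition may differ.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 1000 token positions, of which 2 are active.
toy = [0.0] * 998 + [1.3, 0.4]
density = activation_density(toy)
print(f"{density:.3%}")  # prints 0.200%
```

Under this reading, a density of 0.234% would correspond to roughly 2 to 3 active tokens per 1,000 sampled.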