INDEX
Explanations
adjectives describing purity
references to concepts and qualities associated with purity
Negative Logits
Token        Logit
apter        -0.82
closest      -0.70
prominently  -0.70
otos         -0.65
ieri         -0.64
agements     -0.60
Peninsula    -0.59
Mub          -0.59
closely      -0.58
ails         -0.58
Positive Logits
Token        Logit
bred         1.29
st           0.97
waters       0.93
blood        0.89
r            0.86
blooded      0.86
bliss        0.82
enjoyment    0.80
adrenaline   0.79
evil         0.77
Activations Density: 0.022%