INDEX
Explanations
words related to malevolent or sinister concepts
references to the concept of evil
Negative Logits
dropping    -0.78
eding       -0.78
phrine      -0.77
akeru       -0.76
GN          -0.74
UNCH        -0.72
aro         -0.71
ribe        -0.71
raltar      -0.71
illation    -0.71
Positive Logits
deeds         0.96
deed          0.93
incarn        0.92
twin          0.89
mastermind    0.84
nesses        0.83
genius        0.80
NESS          0.79
empire        0.76
lord          0.76
Activations Density: 0.034%