INDEX
Explanations
adjectives or phrases related to impactful or influential attributes
terms associated with emotional intensity and dramatic elements
Negative Logits
aborted       -0.68
moratorium    -0.65
reports       -0.64
killed        -0.63
repaired      -0.62
untary        -0.62
abbit         -0.60
unprotected   -0.60
hooting       -0.59
ALK           -0.59
Positive Logits
iness         1.02
lihood        0.99
enance        0.88
notations     0.84
tones         0.81
iveness       0.75
aic           0.74
vis           0.74
terness       0.72
uality        0.71
Activations Density: 0.254%