INDEX
Explanations
negative expressions or criticisms
negative emotions and words associated with failure or disappointment
Negative Logits
ancies       -1.14
rams         -0.84
poons        -0.83
fixes        -0.81
alks         -0.81
icts         -0.80
ensions      -0.79
profiles     -0.78
timelines    -0.78
ynski        -0.76
Positive Logits
unto          0.96
breaker       0.90
worth         0.85
reel          0.85
worthy        0.81
compared      0.79
nonetheless   0.79
affair        0.79
akin          0.76
starter       0.75
Activations Density: 0.225%