INDEX
Explanations
negative sentiment and criticism towards various subjects
Negative Logits
caveats         -0.80
loopholes       -0.70
kindness        -0.69
snap            -0.67
decency         -0.66
swoop           -0.66
uncertainties   -0.65
Loll            -0.64
assumptions     -0.63
Panic           -0.63
Positive Logits
founder         1.39
director        1.24
leader          1.19
chair           1.17
authored        1.16
signed          1.14
creator         1.11
producing       1.11
author          1.10
founded         1.08
Activations Density 0.012%