INDEX
Explanations
instances where individuals are vocalizing opinions or criticisms
instances of speaking out or expressing opinions against various issues
Negative Logits
rede        -0.64
vous        -0.60
Gadget      -0.60
liction     -0.59
rolls       -0.59
ahon        -0.58
turnover    -0.58
Lann        -0.58
oxidation   -0.57
oreal       -0.57
Positive Logits
stretched   0.99
loud        0.94
louder      0.82
loudly      0.81
mbuds       0.78
lier        0.78
eka         0.76
rage        0.76
olate       0.75
burst       0.72
Activations Density: 0.029%