INDEX
Explanations
mentions of control, power, or authority
phrases and expressions of frustration or dissatisfaction
Negative Logits
ODUCT        -0.72
20439        -0.68
HCR          -0.67
ourse        -0.66
ItemImage    -0.64
estone       -0.64
catentry     -0.63
Individual   -0.63
iHUD         -0.62
etermined    -0.62
Positive Logits
ya           1.06
dudes        1.01
nerds        0.98
goddamn      0.95
kidding      0.95
damn         0.94
dude         0.94
shit         0.92
fucking      0.92
crap         0.90
Activations Density: 1.563%