INDEX
Explanations
exclamations expressing frustration or confusion
expressions of frustration or disbelief
Negative Logits
Joy              -0.76
Cosponsors       -0.75
Vert             -0.74
NetMessage       -0.71
%:               -0.69
ufact            -0.69
zes              -0.68
PsyNetMessage    -0.67
ause             -0.67
CI               -0.65
Positive Logits
hole             0.81
holes            0.78
ishly            0.73
urous            0.72
dump             0.70
damn             0.67
agna             0.65
fuck             0.64
grin             0.63
kidding          0.63
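The token and weight lists above rank vocabulary tokens by how strongly the feature pushes their logits down or up. The page does not say how the weights were computed. A common approach, assumed here, is to project the feature's decoder direction through the model's unembedding matrix and sort the result. The function name, the toy matrix, and the placeholder tokens below are all hypothetical and are not taken from this page.

```python
def top_and_bottom_logits(decoder_dir, unembed, vocab, k=10):
    """Rank tokens by the logit weight a feature direction assigns them.

    decoder_dir: list of d floats (assumed feature decoder direction)
    unembed:     d rows of V floats (assumed unembedding matrix)
    vocab:       list of V token strings
    Returns (negative, positive): the k lowest and k highest (token, weight) pairs.
    """
    d = len(decoder_dir)
    # Logit weight for token j is the dot product of the direction with column j.
    weights = [
        sum(decoder_dir[i] * unembed[i][j] for i in range(d))
        for j in range(len(vocab))
    ]
    ranked = sorted(zip(vocab, weights), key=lambda pair: pair[1])
    negative = ranked[:k]          # most negative first
    positive = ranked[::-1][:k]    # most positive first
    return negative, positive


# Toy data only: a 2-dimensional direction and a 4-token placeholder vocabulary.
toy_dir = [1.0, 0.5]
toy_unembed = [
    [0.8, -0.7, 0.1, 0.0],
    [0.0, 0.0, 0.2, -0.4],
]
toy_vocab = ["tok_a", "tok_b", "tok_c", "tok_d"]
neg, pos = top_and_bottom_logits(toy_dir, toy_unembed, toy_vocab, k=2)
print("negative:", [t for t, _ in neg])
print("positive:", [t for t, _ in pos])
```

With k=10 this would yield two ten-row lists of the same shape as the ones shown here, though a real dashboard may normalize or center the weights differently.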
Activations Density: 0.022%
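The density figure is usually read as the fraction of sampled tokens on which the feature fires at all. The sketch below computes it under that assumption. The function name, the zero threshold, and the sample counts are hypothetical, and the counts were chosen only to reproduce the displayed percentage.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds the threshold."""
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Made-up sample: 100,000 token positions, 22 of them with a non-zero activation.
sample = [0.0] * 100_000
for position in range(22):
    sample[position] = 1.5
print(f"{activation_density(sample):.3%}")
```

A density this low means the feature is sparse: it is silent on almost all tokens, which is consistent with a feature tied to a narrow class of expressions.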