INDEX
Explanations
sentences indicating strong emotion or emphasis
statements affirming or asserting something
Negative Logits
dies        -0.78
elve        -0.74
luaj        -0.74
asers       -0.72
esm         -0.70
ievers      -0.70
styles      -0.69
IDs         -0.69
anguages    -0.68
ses         -0.67
Positive Logits
happening       0.97
definitely      0.92
supposed        0.91
unacceptable    0.89
NOT             0.88
gonna           0.87
shaping         0.85
bullshit        0.82
ridiculous      0.82
hardly          0.80
Activations Density: 0.110%