INDEX
Explanations
statements made by political figures
quotes from notable figures
Negative Logits
mination     -0.80
otin         -0.79
vier         -0.74
Quality      -0.73
fet          -0.69
notations    -0.69
UTH          -0.69
uploads      -0.67
kn           -0.67
fur          -0.67
Positive Logits
sarcast       1.03
rhet          0.95
quoted        0.85
excerpts      0.83
passionately  0.83
referring     0.82
quoting       0.82
bluntly       0.79
scathing      0.78
addressing    0.76
Activations Density: 0.223%