INDEX
Explanations
phrases indicating disagreement or debate
instances of the word "argue" and its variations
Negative Logits
Token         Logit
Seym          -0.67
beam          -0.67
cy            -0.66
forms         -0.64
finger        -0.63
fing          -0.61
psy           -0.61
Adds          -0.60
photos        -0.59
gallery       -0.58
Positive Logits
Token         Logit
against       0.99
persu         0.92
ative         0.90
vehemently    0.90
convinc       0.89
atively       0.88
forcefully    0.82
passionately  0.82
arians        0.78
loudly        0.77
Activations Density: 0.035%