INDEX
Explanations
mentions of debates or discussions
occurrences of the word "argument."
Negative Logits
Token         Logit
eco           -0.74
oho           -0.73
ummer         -0.70
livest        -0.67
cler          -0.66
ookie         -0.64
appy          -0.63
sight         -0.62
stocking      -0.60
engineering   -0.60
Positive Logits
Token         Logit
uments        1.11
arguments     1.03
ative         0.94
argument      0.92
arguing       0.83
ļéĨĴ          0.82
argument      0.81
ument         0.79
against       0.78
persu         0.78
Activations Density 0.024%
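
As a rough illustration of how a density figure like the one above could be read, the sketch below assumes that "activation density" means the fraction of tokens on which the feature fires (activation above zero). That definition, the function name `activation_density`, and the sample data are all assumptions for illustration; none of them come from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: density = (number of active tokens) / (total tokens).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 3 of 12,500 tokens.
sample = [0.0] * 12_497 + [1.3, 0.4, 2.1]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.024% would mean the feature is active on roughly 24 of every 100,000 tokens.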