INDEX
Explanations
instances of rebuttal or counterarguments
Negative Logits
â̦↵  -0.18
â̦but  -0.17
â̦and  -0.15
â̦it  -0.15
â̦I  -0.14
(  -0.14
[]  -0.14
â̦  -0.14
â̦↵  -0.13
Ì£  -0.13
Positive Logits
EXEMPLARY  0.17
CHARSET  0.16
.scalablytyped  0.15
OVERRIDE  0.15
ãĢĤæľ¬  0.14
backpage  0.14
INTERRUPTION  0.13
contri  0.13
ADDE  0.13
लà¤Ĺ  0.13
Activations Density: 2.127%