Explanations
words related to arguments or debates
references to logical or rhetorical arguments
Negative Logits
Token       Logit
Carbuncle   -0.69
ISTER       -0.63
idays       -0.62
Atomic      -0.62
lights      -0.62
PHOTO       -0.61
ookie       -0.60
unmarked    -0.60
aches       -0.60
ardy        -0.59
Positive Logits
Token       Logit
ative       1.30
uments      1.14
against     1.05
abl         1.03
ument       0.99
ation       0.92
ator        0.91
arguments   0.88
atives      0.88
atively     0.85
Activations Density: 0.040%