INDEX
Explanations
references to historical events and social justice issues
Negative Logits
Token     Logit
Truthy    -0.17
ataire    -0.15
$MESS     -0.15
ngo       -0.15
ucwords   -0.14
zar       -0.14
GM        -0.14
readcr    -0.14
berger    -0.13
monic     -0.13
Positive Logits
Token     Logit
197       0.20
196       0.18
later     0.18
195       0.18
history   0.17
194       0.17
decades   0.16
era       0.15
History   0.15
decade    0.14
Activations Density: 0.998%