INDEX
Explanations
dates, particularly days of the week within the text
Negative Logits
670         -0.15
sed         -0.14
Į¨          -0.14
ething      -0.13
.servers    -0.13
ountry      -0.13
sed         -0.13
ει          -0.13
Slate       -0.13
errick      -0.13
Positive Logits
arde          0.16
asco          0.15
erez          0.15
roe           0.15
ULD           0.15
bih           0.14
.instrument   0.14
asha          0.14
acional       0.14
kÃ¶ÅŁ         0.14
Activations Density: 0.028%