INDEX
Explanations
dates mentioned in parentheses
opening parentheses in text
Negative Logits
everyday      -0.80
untreated     -0.76
deport        -0.73
wre           -0.72
imperson      -0.71
retard        -0.71
overpower     -0.71
intangible    -0.71
extingu       -0.71
mature        -0.71
Positive Logits
pictured      1.36
which         1.34
see           1.34
via           1.30
although      1.25
(missing)     1.24
(missing)     1.24
thanks        1.22
sic           1.22
http          1.22
Activations Density 0.166%