INDEX
Explanations
dates and time-related information
Negative Logits
votes     -0.68
OV        -0.60
laughs    -0.60
Args      -0.58
Brother   -0.57
ADS       -0.57
gery      -0.56
Recomm    -0.56
LY        -0.55
WARD      -0.54
Positive Logits
are       1.47
were      1.38
aren      1.37
weren     1.33
vary      1.18
pring     1.14
abound    1.11
differ    1.11
appear    1.09
mith      1.08
Activations Density: 1.721%