INDEX
Explanations
adverbs that express frequency or likelihood
words and phrases that indicate frequency or typical behavior
Negative Logits
pron      -0.73
athered   -0.72
ËĪ        -0.68
arthy     -0.64
"},"      -0.64
adr       -0.64
umbn      -0.63
spl       -0.62
âĹ¼       -0.61
him       -0.61
Positive Logits
adays     0.82
Ago       0.81
terday    0.78
terness   0.71
theless   0.71
Helpful   0.70
nown      0.69
Comes     0.69
etheless  0.68
Ones      0.67
Activations Density: 0.094%