INDEX
Explanations
mentions of future actions by individuals
instances of the word "would" indicating habitual actions or future intentions
Negative Logits
Corpus        -0.65
rylic         -0.65
resources     -0.65
Efficiency    -0.65
Gutenberg     -0.64
values        -0.64
advant        -0.62
effic         -0.62
Kag           -0.59
ractical      -0.59
Positive Logits
be            1.00
doubtless     0.96
undoubtedly   0.89
gladly        0.89
dearly        0.88
never         0.85
eventually    0.85
ĸļ            0.83
surely        0.82
've           0.79
Activations Density 0.145%