INDEX
Explanations
verbs indicating actions or states of being in past and present tense
Negative Logits
Token          Logit
etheless       -0.81
äºĶ            -0.81
PDATE          -0.75
disobedience   -0.69
ç·             -0.67
somew          -0.66
theless        -0.65
issance        -0.65
å¼             -0.65
åij            -0.64
Positive Logits
Token          Logit
ilon           1.11
esley          0.90
iston          0.89
ams            0.89
ley            0.86
afe            0.82
endor          0.82
avan           0.82
aq             0.82
zel            0.80
Activations Density: 0.070%