Explanations
occurrences of future-oriented verb phrases and modal verbs
Negative Logits
oled        -0.16
aney        -0.15
esium       -0.15
èĪ          -0.15
EEP         -0.14
inu         -0.14
ilver       -0.14
串          -0.14
åĢī         -0.14
uess        -0.13
Positive Logits
oba         0.15
_truth      0.15
compreh     0.15
twins       0.14
Brewer      0.14
appar       0.14
Cunningham  0.14
ÏĢα         0.14
mark        0.13
ÃĿ          0.13
Activations Density 0.001%