INDEX
Explanations
phrases indicating expectations or anticipations related to upcoming events or experiences
Negative Logits
given         -0.16
hausen        -0.15
oš            -0.14
elmet         -0.14
aroo          -0.14
illian        -0.14
ı             -0.14
given         -0.13
æĤ            -0.13
eteria        -0.13
Positive Logits
Expect        0.35
expect        0.35
expect        0.33
Expect        0.33
expectation   0.33
expectations  0.32
EXPECT        0.29
_expect       0.28
EXPECT        0.28
expecting     0.28
Activations Density 0.104%