Explanations
verbs and phrases related to demand, consequences, and change
Negative Logits
favor     -0.14
Favor     -0.14
outers    -0.14
oram      -0.13
peek      -0.13
haven     -0.13
lie       -0.13
stem      -0.13
odash     -0.13
æĽ        -0.13
Positive Logits
expect        0.19
sticking      0.19
NAT           0.18
natural       0.18
Expect        0.18
自然          0.17
expectation   0.17
不可          0.17
Natural       0.17
Expect        0.17
Activations Density 0.012%