INDEX
Explanations
polite requests or directives
Negative Logits
icap      -0.17
над       -0.16
want      -0.15
phải      -0.15
Want      -0.15
bisher    -0.15
lẽ        -0.14
Want      -0.14
dsn       -0.14
mand      -0.14
Positive Logits
note      0.33
Note      0.29
feel      0.29
don       0.27
excuse    0.25
be        0.24
bear      0.22
remember  0.22
consider  0.22
see       0.21
Activations Density 0.032%