Explanations
requests for assistance or action
Negative Logits
phải      -0.15
icap      -0.15
mand      -0.15
lesai     -0.15
sonian    -0.14
YNC       -0.14
над       -0.14
рава      -0.14
regor     -0.14
quire     -0.14
Positive Logits
note      0.30
Note      0.26
don       0.22
Note      0.21
excuse    0.19
-note     0.19
note      0.17
dont      0.16
NOTE      0.16
_note     0.16
Activations Density: 0.025%