Explanations
phrases related to requests and commands
Negative Logits
Token        Logit
Dün          -0.16
lyn          -0.16
ogo          -0.15
osp          -0.14
neau         -0.14
tram         -0.14
.bc          -0.14
strup        -0.14
ontent       -0.14
ourselves    -0.13
Positive Logits
Token        Logit
please       0.30
please       0.26
Please       0.21
Please       0.21
PLEASE       0.21
bitte        0.19
请           0.18
ple          0.18
，请         0.17
SHOW         0.16
Activations Density 0.112%