INDEX
Explanations
phrases that relate to granting or seeking permission
Negative Logits
ÙĪØ±Ø´      -0.16
otos        -0.16
боÑĤ      -0.15
agner       -0.14
енÑı      -0.14
ÑĢÑĥÑĩ      -0.14
544         -0.14
(~(         -0.14
æį          -0.14
reviewer    -0.13
Positive Logits
use         0.23
freely      0.20
access      0.17
unlimited   0.17
opic        0.17
Use         0.17
use         0.16
phép        0.16
Use         0.15
å¯          0.15
Activations Density: 0.206%