Explanations
phrases indicating ability and competence in various tasks
Negative Logits
Try      -0.18
try      -0.17
try      -0.16
icens    -0.16
Try      -0.16
试       -0.15
pec      -0.15
tries    -0.15
試       -0.15
gon      -0.15
Positive Logits
spot       0.21
Spot       0.20
handle     0.20
Spot       0.20
read       0.20
handling   0.20
-spot      0.19
READ       0.19
multit     0.19
multi      0.19
Activations Density 0.241%