Explanations
phrases indicating a selection or variety of choices available
Negative Logits
Token      Logit
ÌĨ         -0.15
̧          -0.14
EATURE     -0.14
imbus      -0.14
DeÄŁer     -0.13
mamak      -0.13
lamaya     -0.13
Entered    -0.13
SENS       -0.13
ils        -0.13
Positive Logits
Token      Logit
choice     0.90
choose     0.79
choices    0.75
choice     0.75
Choice     0.73
-choice    0.71
Choose     0.68
choose     0.68
Choice     0.68
choosing   0.67
Activations Density 0.188%