INDEX
Explanations
mentions of alternatives or choices in various contexts
Negative Logits
arna          -0.17
973           -0.17
653           -0.17
gaard         -0.14
akin          -0.14
418           -0.14
Initializing  -0.14
ναν           -0.14
782           -0.14
385           -0.14
Positive Logits
several       0.25
Bust          0.18
ific          0.18
multiple      0.18
ifice         0.17
plusieurs     0.17
Several       0.16
prefer        0.16
多            0.16
eres          0.16
Activations Density 0.017%