Explanations
the term "well" in various contexts
Negative Logits
nier        -0.19
-basket     -0.17
arius       -0.16
ray         -0.16
_notifier   -0.15
at          -0.14
noop        -0.14
missions    -0.14
aida        -0.14
ounty       -0.14
Positive Logits
beyond      0.22
underway    0.20
intention   0.20
ington      0.19
worth       0.18
aware       0.18
INGTON      0.18
spring      0.18
below       0.17
outside     0.17
Activations Density: 0.021%