INDEX
Explanations
phrases indicating the absence or lack of something
Negative Logits
Token       Logit
Various     -0.23
Numerous    -0.19
Various     -0.18
patrick     -0.16
various     -0.15
Things      -0.15
Most        -0.15
rape        -0.14
most        -0.14
whatever    -0.14
Positive Logits
Token       Logit
thin        0.36
-one        0.35
xious       0.32
things      0.28
isy         0.28
longer      0.27
one         0.26
discern     0.26
further     0.25
ël          0.25
Activations Density: 0.117%
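
The density figure is not defined on this page. A common reading, assumed here, is the fraction of token positions on which the feature's activation is above zero. The sketch below illustrates that arithmetic only; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical, and the sample was made up so that the result lands near the reported value.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature fires at all. The page itself does not state the definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 3 active positions out of 2,564 tokens.
sample = [0.0] * 2561 + [0.4, 1.2, 0.7]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.117%
```

Under this reading, a density of 0.117% means the feature is active on roughly one token position in 850.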