Explanations
striking descriptive adjectives
Negative Logits
on      -2.19
in      -2.16
at      -1.99
took    -1.94
came    -1.88
went    -1.87
get     -1.85
from    -1.84
or      -1.82
all     -1.80
Positive Logits
myriad          2.05
quintessential  1.90
acclaimed       1.89
revamped        1.89
ridiculously    1.88
wacky           1.86
strikingly      1.84
colossal        1.81
shimmering      1.81
insanely        1.78
Activations Density: 0.008%