Explanations
descriptive adjectives that highlight attractive features of places
Negative Logits
omes      -0.18
allest    -0.16
habit     -0.15
ome       -0.15
OME       -0.14
nest      -0.14
ignon     -0.14
etro      -0.14
oust      -0.14
äch       -0.13
Positive Logits
slice     0.24
corner    0.22
former    0.22
spit      0.20
(no token shown)  0.19
bend      0.19
gem       0.19
Slice     0.18
id        0.18
stretch   0.18
Activations Density 0.110%