Explanations
superlative adjectives indicating extremes of size or quality
Negative Logits
allet    -0.17
481      -0.16
isher    -0.15
llib     -0.14
scene    -0.14
ernet    -0.14
usic     -0.14
lets     -0.14
orman    -0.14
adoo     -0.14
Positive Logits
-ever      0.24
ablish     0.21
s          0.17
most       0.15
elerik     0.15
-known     0.14
unta       0.14
-selling   0.14
-vous      0.14
mys        0.14
Activations Density: 0.062%