Explanations
words signifying levels and comparisons in various contexts, often relating to scarcity or abundance
Negative Logits
esion       -0.15
lessness    -0.15
urus        -0.14
ooks        -0.14
neau        -0.14
IDES        -0.14
우리        -0.14
apol        -0.14
tracer      -0.14
림          -0.13
Positive Logits
enough      0.22
ly          0.19
ikt         0.18
indeed      0.17
across      0.17
nhau        0.16
fold        0.16
ingly       0.16
ized        0.16
itud        0.15
Activations Density 0.204%
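One common reading of an activation density figure is the percentage of sampled token positions on which the feature fires at all. The sketch below works under that assumption only. The function name `activation_density`, the zero threshold, and the sample values are hypothetical and are not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of activations strictly above `threshold`.

    Assumption: a feature counts as "active" on a token position when its
    activation exceeds the threshold (0.0 here, an illustrative choice).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: 1000 token positions, 2 of them with non-zero activation.
sample = [0.0] * 998 + [1.3, 0.7]
density = activation_density(sample)
print(f"{density:.3f}%")
```

Under this reading, a density of 0.204% would correspond to roughly 2 active positions per 1000 sampled tokens.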