INDEX
Explanations
instances of the word "down" and variations in various contexts
Negative Logits
ylland   -0.17
wins     -0.16
cury     -0.15
aceous   -0.15
aurant   -0.15
zzo      -0.15
anmar    -0.15
zioni    -0.15
astro    -0.14
phalt    -0.14
POSITIVE LOGITS
-down    0.21
ning     0.16
erb      0.16
graded   0.16
asser    0.15
ned      0.15
्ह       0.15
ward     0.14
/up      0.14
ed       0.14
Activations Density 0.083%