INDEX
Explanations
instances of the word "sun" or its variations
Negative Logits
dorf     -0.22
enton    -0.16
gov      -0.15
964      -0.15
cheng    -0.14
opher    -0.14
cum      -0.14
leans    -0.14
cx       -0.14
omor     -0.14
Positive Logits
ken             0.33
flower          0.31
sets            0.28
flowers         0.26
burst           0.26
nier            0.25
niest           0.25
Microsystems    0.25
rise            0.24
screens         0.24
Activations Density: 0.025%