INDEX
Explanations
mentions of proper nouns
Negative Logits
bucks        -0.75
worms        -0.74
eryl         -0.70
recons       -0.70
bley         -0.69
pter         -0.68
heed         -0.67
nesia        -0.66
ratulations  -0.66
inished      -0.65
Positive Logits
ËĪ            1.12
Shutterstock  1.11
Via           1.00
RTX           0.86
AFP           0.78
Flickr        0.76
Film          0.75
Picture       0.75
Editing       0.75
pione         0.74
Activations Density 0.010%