Explanations
image captions that include technical instructions
prompts to enlarge content or toggle visibility options in media
Negative Logits
angers      -0.63
angering    -0.63
anger       -0.63
upt         -0.62
cardinal    -0.60
uality      -0.60
esses       -0.60
crowds      -0.59
fulness     -0.58
stood       -0.58
Positive Logits
ONSORED      0.94
Enlarge      0.94
WATCHED      0.88
UTERS        0.81
caption      0.76
toggle       0.75
"$:/         0.75
Image        0.73
hran         0.72
Flavoring    0.71
Activations Density 0.012%