Explanations
numerical values embedded within text
numerical identifiers or values associated with images or captions
Negative Logits
Token         Logit
graduate      -0.67
sed           -0.66
itionally     -0.65
lag           -0.62
pick          -0.62
development   -0.61
shapeshifter  -0.61
aimon         -0.61
rugged        -0.61
termin        -0.60
Positive Logits
Token         Logit
Photos        0.87
embed         0.74
IMAGES        0.72
caption       0.71
Pict          0.69
toggle        0.68
Cosponsors    0.68
embed         0.67
depiction     0.67
Thumbnails    0.67
Activations Density: 0.046%
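
The density figure can be read as the share of token positions on which the feature fires. The sketch below illustrates that reading only; it assumes density means the percentage of sampled positions with an activation above a threshold of zero, which the page itself does not state. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of positions whose activation exceeds `threshold`.

    Assumption: "density" is the share of token positions where the
    feature is active; the source page does not define the term.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: 10,000 token positions, 5 of which activate the feature.
sample = [0.0] * 9995 + [1.2, 0.4, 3.1, 0.7, 2.2]
density = activation_density(sample)
print(f"{density:.3f}%")
```

Under this reading, a density of 0.046% would correspond to the feature firing on roughly 46 of every 100,000 token positions.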