INDEX
Explanations
references to specific capitalized words or proper nouns within the text
Negative Logits
ritch       -0.66
odox        -0.66
anted       -0.64
verified    -0.63
qualified   -0.63
tmp         -0.62
Canaver     -0.61
pta         -0.61
acqu        -0.61
yrim        -0.61
Positive Logits
(>          0.82
Thumbnail   0.76
daddy       0.70
gie         0.68
————————    0.68
buster      0.67
lumber      0.67
scoop       0.65
TVs         0.64
swoop       0.64
Activations Density: 0.100%