INDEX
Explanations
instances of the word "arrow"
references to arrows or arrow-related imagery
Negative Logits
ĸļ          -0.88
oppable     -0.69
ĨĴ          -0.68
inen        -0.67
zsche       -0.64
hya         -0.63
Marxism     -0.63
aloud       -0.62
urity       -0.62
ometimes    -0.61
Positive Logits
head        1.11
heads       1.07
heading     0.92
arrow       0.92
arrows      0.89
smith       0.82
prope       0.79
fish        0.78
asso        0.78
headed      0.78
Activations Density: 0.039%