Explanations
words related to horses
references to horses
Negative Logits
Token       Logit
iaries      -0.98
âĶģ         -0.82
unda        -0.79
licted      -0.74
newsp       -0.72
iary        -0.71
lections    -0.69
apse        -0.69
iance       -0.69
usal        -0.68
Positive Logits
Token       Logit
meat        1.09
manship     1.04
Horses      1.00
poke        1.00
men         0.98
horses      0.97
fish        0.95
horse       0.95
women       0.94
wright      0.93
Activations Density: 0.023%