Explanations
the word 'York' with varying degrees of activation
occurrences of the word "York."
Negative Logits
Token        Logit
cius         -0.79
eers         -0.75
anwhile      -0.73
itures       -0.70
aeda         -0.70
Flavoring    -0.70
pmwiki       -0.68
phrine       -0.68
cedes        -0.67
igmat        -0.66
Positive Logits
Token        Logit
mares        0.88
York         0.87
dale         0.84
Huss         0.77
stone        0.75
onne         0.73
cap          0.70
stones       0.70
Brook        0.70
468          0.70
Activations Density 0.010%