INDEX
Explanations
proper nouns with 'yr' followed by a number
Negative Logits
Token     Logit
Ou        -0.84
Papua     -0.71
leeve     -0.70
ELS       -0.66
seams     -0.66
Sahara    -0.66
oise      -0.65
Chao      -0.64
Pixie     -0.63
Samoa     -0.63

Positive Logits
Token     Logit
rha        1.26
rr         0.90
rh         0.86
andom      0.85
azines     0.82
haps       0.81
mph        0.80
umbn       0.79
hon        0.77
umph       0.76
Activations Density: 0.021%
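
For a sense of scale, the density figure can be read as the share of sampled token positions on which the feature fires. The sketch below assumes that reading (activation strictly above zero counts as firing); the function name, the threshold, and the sample data are all hypothetical and are not taken from the dashboard itself.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds threshold.

    Assumption: "density" means the share of token positions on which the
    feature is active. The dashboard's exact definition is not stated here.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 21 active positions out of 100,000 tokens.
sample = [1.5] * 21 + [0.0] * 99_979
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.021%
```

Under that assumption, a density of 0.021% corresponds to roughly 21 active positions per 100,000 tokens, so the feature is sparse.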