INDEX
Explanations
proper nouns, especially names and abbreviations
references to a specific entity or organization
Negative Logits
Token        Logit
encour       -0.71
Granger      -0.68
creen        -0.66
Sorceress    -0.62
Xperia       -0.61
Garc         -0.61
cov          -0.61
colourful    -0.60
slack        -0.60
Santos       -0.59
Positive Logits
Token        Logit
senal        1.45
ansom        1.22
abbit        1.13
haps         1.08
agnar        1.07
acing        1.07
itchie       1.06
OUP          1.05
ICH          1.02
utherford    1.02
Activations Density 0.042%