INDEX
Explanations
locations or proper nouns
geographic or organizational names and terms
Negative Logits
similar       -0.71
warranty      -0.68
alt           -0.64
overw         -0.64
visible       -0.62
routine       -0.62
concealed     -0.62
masked        -0.62
prior         -0.62
worthwhile    -0.61
Positive Logits
assian        1.13
elia          1.08
leton         1.08
ley           1.00
ena           1.00
hani          1.00
ham           0.98
hart          0.97
ayne          0.96
ford          0.96
Activations Density: 0.187%