INDEX
Explanations
medical terms and location names related to specific countries
proper nouns related to places and institutions
Negative Logits
xp        -0.59
pak       -0.59
tick      -0.57
clicks    -0.57
stakes    -0.57
caps      -0.57
SPACE     -0.57
crop      -0.56
haste     -0.55
space     -0.55
Positive Logits
elaide        0.89
odore         0.88
xon           0.84
west          0.83
oldemort      0.77
agall         0.76
rimination    0.75
endment       0.75
bnb           0.75
alyst         0.75
Activations Density 0.205%
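
As a rough illustration of how a density figure like the one above might be computed, the sketch below assumes that "activation density" means the fraction of tokens on which the feature's activation exceeds a threshold. That definition, the function name `activation_density`, the `threshold` parameter, and the sample data are all assumptions made for illustration; they are not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of activations strictly above `threshold`.

    Assumption: density = (tokens where the feature fires) / (all tokens).
    """
    values = list(activations)
    if not values:
        return 0.0
    active = sum(1 for a in values if a > threshold)
    return active / len(values)


# Hypothetical data: 2,000 tokens, with the feature firing on 4 of them.
acts = [0.0] * 2000
for index, value in [(10, 0.7), (500, 1.3), (1200, 0.2), (1999, 2.1)]:
    acts[index] = value

density = activation_density(acts)
print(f"{density:.3%}")  # prints 0.200%
```

Under this reading, a density of 0.205% would correspond to the feature firing on roughly 2 of every 1,000 tokens.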