Explanations
proper nouns
proper nouns, particularly names and titles
NEGATIVE LOGITS
mble            -0.62
etheless        -0.60
ģĸ              -0.59
duplicate       -0.57
dism            -0.55
iculty          -0.54
redistributed   -0.54
sembly          -0.53
dashed          -0.52
merce           -0.52
POSITIVE LOGITS
Fest            0.72
Coin            0.68
Ps              0.67
ratom           0.67
Wiki            0.65
Street          0.62
Avenue          0.62
coin            0.61
Era             0.60
inia            0.60
Activations Density 0.607%