INDEX
Explanations
country names and related entities
Negative Logits
wcs       -0.76
ufact     -0.75
afety     -0.71
rss       -0.70
osaurus   -0.70
elist     -0.70
xon       -0.69
zsche     -0.68
oslav     -0.68
gow       -0.67
Positive Logits
riches    0.68
fortune   0.59
ãĥ¼ãĥĨ    0.55
iggs      0.51
Maxwell   0.51
fortunes  0.50
Divine    0.50
Fortune   0.49
ÃĤ        0.48
±         0.48
Activations Density: 2.179%
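
Some tokens in the logit lists (ãĥ¼ãĥĨ, ÃĤ, ±) look like raw byte-level BPE strings rather than readable text. The sketch below shows one way to map them back to text. It assumes, without confirmation from this page, that the model uses a GPT-2-style byte-to-unicode table; the function names are hypothetical helpers, not part of any dashboard API.

```python
def bytes_to_unicode():
    """Rebuild the GPT-2-style byte -> printable-character table.

    Assumption: the tokens on this page come from a tokenizer that uses
    this mapping. That is not stated in the source.
    """
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    byte_values = printable[:]
    code_points = printable[:]
    offset = 0
    for b in range(256):
        if b not in printable:
            # Non-printable bytes are shifted to code points 256 and up.
            byte_values.append(b)
            code_points.append(256 + offset)
            offset += 1
    return {b: chr(cp) for b, cp in zip(byte_values, code_points)}


# Inverse table: displayed character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_byte_level_token(token):
    """Hypothetical helper: turn a displayed token back into text.

    Byte sequences that are not valid UTF-8 on their own (a token can be
    a fragment of a multi-byte character) become U+FFFD.
    """
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ãĥ¼ãĥĨ", "ÃĤ", "±", "riches"]:
        print(repr(tok), "->", repr(decode_byte_level_token(tok)))
```

Under that assumption, ãĥ¼ãĥĨ decodes to two katakana characters and ÃĤ to a single accented Latin letter, while ± is a lone byte that is not a complete UTF-8 character by itself. Plain ASCII tokens such as riches pass through unchanged.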