INDEX
Explanations
proper nouns related to notable individuals or locations
references to specific geographic locations or entities
Negative Logits
urat        -0.79
perty       -0.77
arak        -0.74
yip         -0.74
odan        -0.72
atro        -0.68
Mehran      -0.67
indo        -0.66
wcs         -0.66
uras        -0.64
Positive Logits
sworth       0.85
neum         0.73
gins         0.71
è£ıè         0.69
beck         0.68
sted         0.68
hyde         0.67
Henry        0.66
enegger      0.65
ettes        0.64
Activations Density 0.213%