Explanations
proper nouns and names related to notable individuals and places
Negative Logits
ari          -0.16
adro         -0.15
cent         -0.14
Schn         -0.14
iq           -0.14
ampil        -0.14
%E           -0.14
eldon        -0.14
ordes        -0.14
ÑĦак         -0.13
Positive Logits
ortal        0.17
otherwise    0.16
swire        0.16
çĶ           0.15
ilda         0.15
_HS          0.15
otherwise    0.15
deo          0.14
üre          0.14
Insensitive  0.14
Activations Density: 0.497%