Explanations
proper nouns and specific names related to organizations, individuals, and locations
Negative Logits
ingle      -0.17
trif       -0.16
aan        -0.16
.strict    -0.16
esome      -0.15
aal        -0.15
idden      -0.14
599        -0.14
Bench      -0.14
emd        -0.14
Positive Logits
andır      0.15
ucwords    0.14
chez       0.14
¬´         0.14
okane      0.14
ibr        0.14
itorio     0.14
езда       0.14
æļ®        0.13
.kr        0.13
Activations Density: 0.013%