INDEX
Explanations
names of people and notable figures
Negative Logits
pest     -0.19
ening    -0.17
enna     -0.15
Lena     -0.15
acity    -0.14
bs       -0.14
$$$$     -0.14
gos      -0.14
REN      -0.14
eman     -0.14
Positive Logits
yw            0.17
isci          0.16
eč            0.15
asio          0.15
istrovství    0.15
واره          0.15
hum           0.15
cy            0.15
woord         0.14
nem           0.14
Activations Density 0.221%