INDEX
Explanations
phrases related to significant events or notable individuals
Negative Logits
nz        -0.17
gend      -0.16
å         -0.15
è§ĦèĮĥ    -0.14
buch      -0.14
æk        -0.14
аÑĢаÑĤ     -0.14
kø        -0.14
ún        -0.13
пов       -0.13
Positive Logits
Swedish    0.47
Sweden     0.46
Stockholm  0.45
Sweden     0.41
Svens      0.36
stockholm  0.36
svenska    0.29
Lund       0.28
.se        0.27
Anders     0.26
Activations Density: 0.224%