INDEX
Explanations
the word "Kyiv" spelled in different ways
references to the term "Ky."
Negative Logits
ional       -0.89
heads       -0.78
ACTED       -0.75
IBLE        -0.71
ÙĴ          -0.71
à¨          -0.70
interface   -0.69
ãĥĩãĤ£      -0.69
bidden      -0.69
天          -0.67
Positive Logits
Ky          0.98
olk         0.87
ewitness    0.85
kees        0.83
orea        0.83
nect        0.80
annis       0.78
£ı          0.77
ush         0.77
oshi        0.75
Activation Density: 0.011%