Explanations
proper nouns, particularly names of people and organizations
Negative Logits
Token        Logit
mai          -0.14
vor          -0.14
ep           -0.13
raison       -0.13
endregion    -0.13
frog         -0.13
Gordon       -0.13
ume          -0.13
Mai          -0.13
Â            -0.13
Positive Logits
Token            Logit
's               0.28
’s               0.25
ìĿĺ (의)         0.20
çļĦ (的)         0.19
çļĦæĥħ (的情)    0.19
'options         0.18
çļĦæīĭ (的手)    0.17
çļĦå°ı (的小)    0.16
’S               0.16
heimer           0.16
Activations Density: 0.179%