INDEX
Explanations
proper nouns related to specific locations or people
references to locations and notable figures
Negative Logits
Token                    Logit
ibaba                    -0.70
unks                     -0.68
ez                       -0.64
ranged                   -0.63
ãĥĬ (ナ)                 -0.63
imony                    -0.62
ãĥīãĥ©ãĤ´ãĥ³ (ドラゴン)  -0.62
zai                      -0.62
unk                      -0.62
yk                       -0.61
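Entries such as ãĥĬ and ãĥīãĥ©ãĤ´ãĥ³ look like byte-level BPE token strings, in which each UTF-8 byte is displayed as a printable stand-in character. The sketch below assumes the tokenizer uses the GPT-2 byte-to-character table (an assumption; the page does not name the tokenizer) and shows how such strings can be mapped back to readable text. The helper names `bytes_to_unicode` and `decode_token` are illustrative only.

```python
def bytes_to_unicode():
    """Rebuild the GPT-2 style byte -> printable-character table (assumed here)."""
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point starting at 256.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return {b: chr(c) for b, c in zip(bs, cs)}


# Inverse table: displayed character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Turn a displayed token string back into the text it encodes."""
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in token)
    # Partial multi-byte sequences are replaced rather than raising.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ãĥĬ", "ãĥīãĥ©ãĤ´ãĥ³", "ibaba"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Plain ASCII tokens such as ibaba pass through unchanged, so the same function can be applied to every row of the table.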
Positive Logits
Token    Logit
C        2.24
C        1.92
c        1.59
Cs       1.41
c        1.40
Ce       1.29
Ci       1.24
CK       1.23
CIS      1.23
CCC      1.23
Activations Density 0.745%