Explanations
names of places, potentially with non-English characters, or references to events or people in a specific context
special characters or symbols used in the text
Negative Logits
izable     -0.83
nels       -0.69
giveaways  -0.64
mort       -0.63
stocks     -0.61
goodwill   -0.61
fell       -0.61
mutual     -0.61
rooting    -0.58
ilater     -0.58
Positive Logits
Ä   1.42
ĥ   1.31
Ľ   1.25
ħ   1.24
ķ   1.23
¼   1.21
Ģ   1.20
ı   1.19
ij  1.17
ĸ   1.17
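The single-character tokens in the positive list look like the placeholder characters that byte-level BPE vocabularies use for raw bytes. The page does not name the tokenizer, so the following is only a sketch under the assumption that the vocabulary follows the GPT-2 byte-to-character convention. The helper names `token_to_bytes` and `describe` are hypothetical and exist only for this illustration.

```python
def bytes_to_unicode():
    """Byte -> printable character table (GPT-2 style byte-level BPE; assumed here)."""
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    byte_values = printable[:]
    code_points = printable[:]
    shift = 0
    for b in range(256):
        if b not in printable:
            # Non-printable bytes are remapped to code points starting at 256.
            byte_values.append(b)
            code_points.append(256 + shift)
            shift += 1
    return {b: chr(c) for b, c in zip(byte_values, code_points)}


# Inverse table: placeholder character -> raw byte value.
CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def token_to_bytes(token):
    """Hypothetical helper: map a vocabulary token string back to its raw bytes."""
    return bytes(CHAR_TO_BYTE[ch] for ch in token)


def describe(byte):
    """Hypothetical helper: classify a byte by its role in UTF-8."""
    if byte < 0x80:
        return "ASCII"
    if byte < 0xC0:
        return "continuation byte"
    return "lead byte range"


if __name__ == "__main__":
    # A few tokens copied from the positive-logit list above.
    for token in ["Ä", "ĥ", "Ģ", "¼"]:
        raw = token_to_bytes(token)
        print(token, raw.hex(), [describe(b) for b in raw])
```

Under that assumption, `Ä` maps to byte 0xC4, which in UTF-8 is the lead byte for code points U+0100 through U+013F, and `ĥ`, `Ģ`, and `¼` map to 0x83, 0x80, and 0xBC, which fall in the continuation-byte range. That reading would be consistent with the first explanation's mention of non-English characters, but it does not hold if the model uses a different tokenizer.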
Activations Density 0.006%