Explanations
biographical information about individuals, including names, dates, and places
Negative Logits
Token      Logit
enden      -0.17
erten      -0.14
еÑĪ        -0.14
fitness    -0.14
roud       -0.13
åģ¥        -0.13
chter      -0.13
egas       -0.13
asl        -0.13
astos      -0.13
Positive Logits
Token      Logit
IPA        0.20
IPA        0.20
sometimes  0.18
adalah     0.18
,[         0.17
ï¼īãģ¯     0.17
[a         0.17
is         0.17
)ìĿĢ       0.16
,[         0.16
Activations Density: 0.093%
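
Several tokens in the logit lists (for example `ï¼īãģ¯`, `)ìĿĢ`, and `åģ¥`) look like the printable stand-ins that byte-level BPE tokenizers use for raw UTF-8 bytes. The following is a minimal sketch for decoding them, assuming the dashboard uses the GPT-2-style byte-to-character mapping; the source does not state which tokenizer produced these tokens, so treat that as an assumption. The helper names `bytes_to_unicode` and `decode_token` are illustrative and not part of the dashboard.

```python
def bytes_to_unicode():
    """Build the GPT-2-style map from each byte value to a printable character."""
    # Bytes that are already printable keep their own code point.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to an unused code point starting at 256.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Inverse map: displayed character -> original byte value.
BYTE_DECODER = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(display):
    """Turn a displayed token string back into readable text.

    Characters outside the byte alphabet (assumed to be already decoded)
    are passed through as their own UTF-8 bytes.
    """
    raw = bytearray()
    for ch in display:
        if ch in BYTE_DECODER:
            raw.append(BYTE_DECODER[ch])
        else:
            raw.extend(ch.encode("utf-8"))
    # Invalid byte sequences become U+FFFD instead of raising an error.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ï¼īãģ¯", ")ìĿĢ", "åģ¥", "fitness"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Under that assumption, `ï¼īãģ¯` would decode to `）は` (a full-width closing parenthesis followed by the Japanese topic particle) and `)ìĿĢ` to `)은` (a closing parenthesis followed by the Korean topic particle). Both fit the explanation above, since encyclopedic biographies commonly open with a name, a parenthetical with dates, and then a topic marker or copula such as "is" or "adalah".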