Explanations
text in a non-Latin script, possibly indicating foreign language content or encoding issues
Negative Logits
ěr
-0.18
эффек
-0.15
bas
-0.15
¯
-0.14
段
-0.14
информа
-0.13
possessions
-0.13
칙
-0.13
Sibling
-0.13
ingham
-0.13
Positive Logits
ÂŃi
0.17
λιά
0.15
Beckham
0.15
od
0.15
člov
0.14
og
0.14
ví
0.14
ースト
0.13
інь
0.13
なんて
0.13
Activations Density 0.156%