INDEX
Explanations
references to specific individuals or personal identifiers
Negative Logits
Token    Logit
ISI      -0.20
(IC      -0.17
(ID      -0.16
ICT      -0.16
ó        -0.15
=in      -0.15
еÑģÑĮ    -0.15
±Ð¾ÑĤ    -0.15
/in      -0.15
(IS      -0.15
Positive Logits
Token    Logit
Ä«       0.41
×Ļ×      0.40
ÃŃ       0.39
İ        0.37
ï        0.36
ãĤ¤      0.35
Ñĸ       0.34
ì        0.34
ÂŃi      0.34
িà¦      0.34
Activations Density 0.514%
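
Many of the token strings in the logit lists look like mojibake. One possible reading, which the source does not confirm, is that they are GPT-2-style byte-level BPE tokens, where every raw byte is shown as a printable stand-in character. The sketch below rebuilds that byte table and maps a displayed token back to text under that assumption. The helper names `bytes_to_unicode`, `CHAR_TO_BYTE` and `decode_token` are illustrative, not part of any dashboard API.

```python
def bytes_to_unicode():
    """Build the GPT-2-style table mapping each byte (0-255) to a printable character."""
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    byte_values = printable[:]
    code_points = printable[:]
    shift = 0
    for b in range(256):
        if b not in printable:
            # Non-printable bytes are shifted to code points 256 and above.
            byte_values.append(b)
            code_points.append(256 + shift)
            shift += 1
    return {b: chr(c) for b, c in zip(byte_values, code_points)}


# Inverse table: displayed character -> raw byte value.
CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token):
    """Map a displayed token back to text (assumes byte-level BPE display)."""
    raw = bytearray()
    for ch in token:
        if ch in CHAR_TO_BYTE:
            raw.append(CHAR_TO_BYTE[ch])
        else:
            # Assumption: characters outside the table are already decoded text.
            raw.extend(ch.encode("utf-8"))
    # Tokens can end mid-character; incomplete bytes become U+FFFD.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["Ä«", "ÃŃ", "ãĤ¤", "Ñĸ", "ISI"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Under this assumption, `Ä«` decodes to `ī` and `ãĤ¤` decodes to `イ`. Tokens such as `ï` or `ì` are single lead bytes of longer UTF-8 sequences, so on their own they decode only to the replacement character.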