Explanations
phrases indicating descriptions and evaluations of experiences or situations
Negative Logits
라도          -0.15
himself       -0.14
있다는        -0.13
δά            -0.13
antz          -0.13
enumerator    -0.13
upert         -0.13
察            -0.13
punt          -0.12
пласти        -0.12
Positive Logits
"             0.28
«             0.25
"[            0.24
“             0.23
“[            0.22
'             0.21
``            0.21
「            0.19
'[            0.19
"(            0.18
Activations Density 0.177%