INDEX
Explanations
phrases expressing opinions or critiques
Negative Logits
šk          -0.15
leftright   -0.14
росто       -0.14
$_['        -0.13
…)↵↵        -0.13
edb         -0.13
'],['       -0.13
[Index      -0.13
],&         -0.13
bilt        -0.13
Positive Logits
)           0.18
]           0.16
-)          0.16
[,]         0.15
&)          0.14
udge        0.14
depending   0.14
ispens      0.14
?)          0.13
EGIN        0.13
Activations Density 0.129%