INDEX
NEGATIVE LOGITS
Cum           -0.07
内            -0.06
-X            -0.06
human         -0.06
弥            -0.06
Poll          -0.06
stretched     -0.06
Fn            -0.06
-high         -0.06
|--------------------------------------------------------------------------↵  -0.06
POSITIVE LOGITS
тою           0.07
arte          0.06
ова           0.06
persona       0.06
activist      0.06
са            0.06
entrepreneur  0.06
.sky          0.06
simpl         0.06
_THROW        0.06
Activations Density: 0.036%