Negative Logits
Token    Value
on       0.74
وم       0.66
Β        0.65
to       0.63
५        0.61
Pued     0.60
५        0.60
大脑     0.60
۴        0.59
To       0.59
Positive Logits
Token             Value
,                 1.05
f                 1.02
i                 0.95
the               0.89
ي                 0.88
;                 0.80
intellectually    0.79
er                0.74
?                 0.74
ſe                0.73
Activations Density 0.001%