INDEX
NEGATIVE LOGITS
社会              -0.08
mployee         -0.07
수행              -0.07
':'             -0.07
рем             -0.06
hated           -0.06
mp              -0.06
planting        -0.06
withheld        -0.06
responseObject  -0.06
POSITIVE LOGITS
/settings       0.07
Gover           0.06
estilo          0.06
[right          0.06
_prim           0.06
conclusions     0.06
()}</           0.06
tube            0.05
نه              0.05
каче            0.05
ACTIVATIONS DENSITY: 0.004%