Negative Logits
console        -0.08
qualifier      -0.08
has            -0.07
spec           -0.07
specif         -0.07
teht           -0.07
assistant      -0.07
features       -0.07
ranking        -0.07
description    -0.07
Positive Logits
victims        0.09
侵犯           0.09
celebrities    0.09
Brennan        0.09
pornography    0.09
Arbitration    0.09
offend         0.08
blancos        0.08
reef           0.08
�              0.08
Activations Density: 0.004%