INDEX
NEGATIVE LOGITS
批    -0.07
coefficient    -0.07
politically    -0.07
temporary    -0.07
엘    -0.07
hypertension    -0.06
workspace    -0.06
=============================================================================↵    -0.06
########################################################    -0.06
kênh    -0.06
POSITIVE LOGITS
gré    0.07
şiddet    0.06
sembled    0.06
veloc    0.06
terug    0.06
vie    0.06
zwarte    0.06
laughed    0.06
„J    0.06
gorge    0.06
Activations Density 0.067%