INDEX
NEGATIVE LOGITS
distrust      -0.07
_expected     -0.07
grandi        -0.07
.currency     -0.06
propaganda    -0.06
değişiklik    -0.06
건            -0.06
olmak         -0.06
.Prop         -0.06
.Port         -0.06

POSITIVE LOGITS
gymn          0.07
rationale     0.06
finale        0.06
Mutation      0.06
exclusively   0.06
instructor    0.06
fusc          0.06
زة            0.06
selective     0.06
Clr           0.06

Activations Density: 0.026%