NEGATIVE LOGITS
Token          Logit
paradox        -0.07
causal         -0.07
_inc           -0.07
_assignment    -0.07
challenge      -0.07
Challenge      -0.07
graded         -0.07
hardware       -0.07
challenges     -0.06
Sponsor        -0.06
POSITIVE LOGITS
Token          Logit
polite         0.17
politely       0.14
polit          0.07
courteous      0.07
diplomatic     0.07
İ              0.06
��             0.06
Phones         0.06
lara           0.06
diye           0.06
Activations Density: 0.004%