Negative Logits
㪕            0.43
boutons      0.41
哏            0.40
Blocked      0.39
participa    0.39
Distortion   0.39
Checking     0.38
participe    0.38
obeys        0.38
رو           0.38
Positive Logits
stipulates   1.09
prohibits    1.05
stip         1.05
предусматри  1.05
forbids      1.00
prohibiting  0.95
governs      0.90
governing    0.89
specifies    0.88
forbid       0.88
Activations Density: 0.067%