INDEX
NEGATIVE LOGITS
Token           Logit
df              0.38
bypass          0.36
^(              0.36
Bypass          0.35
Moving          0.34
ou              0.34
transmitted     0.34
Ste             0.33
def             0.33
displaystyle    0.33
POSITIVE LOGITS
Token           Logit
into            0.71
домой           0.70
toward          0.65
hacia           0.63
towards         0.63
undertaken      0.58
INTO            0.57
menuju          0.55
туда            0.55
across          0.54
Activations Density: 0.028%