INDEX
NEGATIVE LOGITS
Token            Logit
Y                0.37
Tanh             0.34
erste            0.33
E                0.32
A                0.32
PI               0.31
خاصيه            0.31
Eigenschaften    0.30
ανε              0.30
款               0.30
POSITIVE LOGITS
Token            Logit
at               0.43
helping          0.39
facilitating     0.36
promoting        0.35
銠               0.35
isted            0.34
enjoying         0.33
listening        0.33
ruiting          0.33
且               0.33
Activations Density 0.137%