Negative Logits
mismatched        0.40
inappropriately   0.40
differing         0.37
왤                0.37
whether           0.36
assent            0.36
supplemental      0.35
banding           0.35
linkages          0.35
annoyance         0.35
Positive Logits
Kindly            0.61
Voici             0.57
Voici             0.53
Please            0.52
Kindly            0.47
ขอ                0.47
´                 0.47
пожалуйста        0.47
부탁              0.46
жалуйста          0.46
Activations Density 0.026%