Negative Logits
NaN  0.46
郯  0.42
ッフ  0.40
परवानगी  0.40
અનુ  0.40
Unauthorized  0.40
Unauthorized  0.40
窕  0.40
ներ  0.39
สิน  0.39
Positive Logits
synonymous  0.48
cinta  0.44
스의  0.44
carrots  0.43
sadde  0.43
↵↵  0.42
aldığı  0.42
sette  0.42
apal  0.41
otto  0.41
Activations Density 0.006%