INDEX
NEGATIVE LOGITS
love            0.74
card            0.73
circumstances   0.73
parking         0.73
ẹp              0.72
خراب            0.71
practice        0.69
आप              0.69
াব্দ            0.69
iful            0.65
POSITIVE LOGITS
Statement       0.90
والع            0.82
atoti           0.77
屒              0.76
समेत            0.76
';"+            0.76
㐱              0.75
<unused87>      0.74
țiunea          0.74
ši              0.74
Activations Density 0.003%