INDEX
NEGATIVE LOGITS
iktar  0.86
సమావేశ  0.80
갠  0.76
шар  0.76
خصص  0.75
লাষ  0.73
fornisce  0.72
шую  0.72
御  0.72
dùng  0.72

POSITIVE LOGITS
démon  0.56
dis  0.53
People  0.53
switch  0.53
short  0.51
flag  0.51
changed  0.51
instructions  0.51
sections  0.51
ulfillment  0.50
Activations Density 0.003%