INDEX
NEGATIVE LOGITS
$\  0.47
ă  0.45
\  0.45
\#  0.40
âh  0.39
…  0.39
concession  0.38
Â  0.38
</b>  0.38
Junction  0.38
POSITIVE LOGITS
ेलकम  0.53
Pickett  0.48
☜  0.48
ფილი  0.46
经典的  0.46
驭  0.46
behö  0.45
upd  0.45
োজনের  0.45
펭  0.45
Activations Density 0.005%