INDEX
Explanations
collective communication or data collection
Negative Logits
as      1.31
ра      1.23
are     1.15
ك       1.14
,       1.07
د       1.05
å       0.99
ian     0.94
𝙤       0.93
一個    0.92
POSITIVE LOGITS
m       1.14
<0x0D>  1.09
০০০     1.01
ní      0.99
l       0.97
mama    0.96
tio     0.93
rda     0.88
tay     0.88
thed    0.88
Activations Density 0.205%