INDEX
Explanations
originates or causes development
Negative Logits
па          0.87
on          0.82
ти          0.76
nament      0.75
িং          0.73
on          0.73
بیاکت       0.72
ură         0.68
بیاکتنې     0.68
ပြည်        0.68
Positive Logits
</h3>       0.82
denna       0.80
k           0.79
지          0.78
</h2>       0.77
denne       0.75
(no visible token)  0.75
ش           0.75
H           0.75
어          0.73
Activations Density 0.014%