INDEX
Explanations
Population size and evolution
Negative Logits
ك  1.41
۔  1.15
は  1.01
مس  1.00
یت  0.96
بد  0.91
።  0.89
נ  0.88
માં  0.88
رد  0.87
Positive Logits
is  1.29
Population  1.13
ти  1.13
us  1.05
population  1.05
ן  1.02
те  1.01
ু  0.98
populations  0.96
(blank)  0.95
Activations Density 0.014%