INDEX
Explanations
phrases indicating the prevalence or majority within groups
Negative Logits
perbedaan    -0.56
perpétu      -0.52
urra         -0.51
ipot         -0.51
druk         -0.51
regalo       -0.50
adanya       -0.49
!("{         -0.49
essä         -0.48
tegas        -0.47
Positive Logits
meisten          0.99
TagMode          0.88
majority         0.85
большинство      0.85
maioria          0.81
plupart          0.81
большин          0.80
Majority         0.78
UnusedPrivate    0.76
majority         0.75
Activations Density 0.341%