INDEX
Explanations
phrases indicating large quantities or proportions
Negative Logits
-0.72
-0.63
TAWA  -0.62
Paglinawan  -0.59
μφωνα  -0.58
KEYCODE  -0.58
препратки  -0.58
stepwise  -0.58
μως  -0.58
AndEndTag  -0.55
Positive Logits
majority  1.02
many  0.95
maioria  0.90
majority  0.87
plupart  0.83
çoğu  0.82
大部分  0.81
majorité  0.81
большин  0.80
MANY  0.79
Activations Density 0.373%