INDEX
Explanations
references to significant proportions or majorities in various contexts
Negative Logits
sometimes      -0.17
slightest      -0.16
sometimes      -0.16
occasionally   -0.16
smallest       -0.15
Sometimes      -0.14
uy             -0.13
enet           -0.13
XL             -0.13
icks           -0.13
Positive Logits
majority       0.93
Majority       0.68
большин        0.61
대부분         0.60
most           0.59
meisten        0.59
větš           0.57
mostly         0.54
mayoría        0.52
mostly         0.47
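
The source does not say how these logit lists are produced. A common approach, assumed here, is to project the feature's output direction through the model's unembedding matrix and rank the vocabulary by the resulting score. The sketch below illustrates that idea with made-up numbers; `logit_effects`, `direction`, `unembed`, and `vocab` are hypothetical names, not part of any dashboard API.

```python
def logit_effects(direction, unembed, vocab):
    """Rank vocabulary tokens by the dot product of a feature direction
    with each token's unembedding row (highest score first)."""
    scores = [sum(d * w for d, w in zip(direction, row)) for row in unembed]
    return sorted(zip(vocab, scores), key=lambda pair: pair[1], reverse=True)


# Toy values chosen for illustration only; they are not taken from the dashboard.
vocab = ["majority", "most", "sometimes"]
unembed = [[0.9, 0.1], [0.6, 0.0], [-0.2, 0.1]]  # one row per vocabulary token
direction = [1.0, 0.0]                            # assumed feature output direction

ranked = logit_effects(direction, unembed, vocab)
top_positive = ranked[:1]    # tokens the feature pushes up the most
top_negative = ranked[-1:]   # tokens the feature pushes down the most
```

Under this reading, a large positive value for "majority" means the feature raises that token's logit when it fires, and the small negative values mean only a weak push away from words like "sometimes".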
Activations Density 0.719%
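
The density figure is not defined in the source. A minimal sketch, assuming it means the fraction of sampled tokens on which the feature's activation is above zero, is given below; `activation_density` and its `threshold` parameter are hypothetical names used only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of activations strictly above `threshold`."""
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy sample: the feature fires on 1 token out of 100.
sample = [0.0] * 99 + [2.5]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this assumption, a density of 0.719% would mean the feature is active on roughly 7 of every 1,000 tokens.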