INDEX
Explanations
phrases indicating frequency or habitual actions
Negative Logits
ειν          -0.65
WriteLine    -0.64
Potsdam      -0.63
Descubre     -0.60
!("{         -0.59
classnames   -0.58
אד           -0.57
thoát        -0.56
iParam       -0.55
flik         -0.55
Positive Logits
solito       1.05
usually      0.95
Usual        0.94
Usually      0.92
Usually      0.87
propOrder    0.84
üb           0.83
Typically    0.81
UALLY        0.81
suele        0.81
Activations Density 0.088%
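
The density figure above is usually read as the share of token positions on which the feature fires at all. A minimal sketch of that calculation follows, assuming "fires" means an activation strictly above zero; the function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical and not taken from this dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: a feature counts as active when its activation is strictly
    greater than `threshold` (0.0 by default).
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical per-token activations for one feature over a tiny sample.
sample = [0.0, 0.0, 1.3, 0.0]
density = activation_density(sample)
print(f"Activations Density {density:.3%}")
```

Under this reading, a density of 0.088% would correspond to the feature being active on roughly 9 out of every 10,000 tokens in the sampled text.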