INDEX
Explanations
expressions of concern or lack of concern
don’t care about
Negative Logits
routines  -0.42
}));      -0.42
>)        -0.41
﴿         -0.41
].)       -0.41
])]       -0.40
]));      -0.39
]));      -0.38
PPI       -0.38
}));      -0.38
Positive Logits
cared          0.72
propOrder      0.68
UnsafeEnabled  0.66
égale          0.66
frågan         0.62
mattered       0.61
Ahnung         0.61
banget         0.61
cares          0.61
preocupar      0.61
Activations Density: 0.006%