INDEX
Explanations
questions exploring decision-making and critical thinking
Negative Logits
it          -0.67
the         -0.67
they        -0.67
we          -0.59
getItemId   -0.57
he          -0.56
neither     -0.54
you         -0.54
mel         -0.51
они         -0.51
Positive Logits
lesquelles    0.74
What          0.73
Whom          0.71
What          0.69
belangrij     0.69
الإنجليزية    0.68
what          0.67
Which         0.67
copg          0.66
astéroïdes    0.65
Activations Density 0.187%