INDEX
Explanations
referencing and qualifying statements related to particular contexts in discussions about behavior or actions
Negative Logits
them      -0.82
Them      -0.80
THEM      -0.78
őket      -0.69
Them      -0.67
henne     -0.67
honom     -0.66
これを    -0.65
jambes    -0.63
lui       -0.63
Positive Logits
there         1.62
they          1.18
we            1.10
the           1.02
there         0.95
individuals   0.85
)";           0.85
certain       0.84
)");          0.84
it            0.82
Activations Density: 1.331%