INDEX
Explanations
words related to collective or shared experiences
Negative Logits
they          -0.49
she           -0.46
вони          -0.46
they          -0.41
BOTH          -0.41
они           -0.40
onlar         -0.39
horizontally  -0.38
Both          -0.38
ella          -0.38
Positive Logits
its        1.10
how        0.91
their      0.88
její       0.73
cómo       0.73
seus       0.71
bagaimana  0.70
related    0.68
ihrer      0.68
suas       0.66
Activations Density: 0.631%