Explanations
references to ownership or possession
Negative Logits
↵↵       -0.66
.        -0.64
<eos>    -0.64
and      -0.63
↵        -0.61
'        -0.61
(blank)  -0.57
1        -0.56
et       -0.56
(blank)  -0.55
Positive Logits
próprio   1.45
own       1.44
próprios  1.36
próprias  1.32
Own       1.31
propia    1.29
itſelf    1.28
própria   1.28
egne      1.28
propio    1.27
Activations Density: 0.223%
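As a rough illustration of what a density figure like this usually means, the sketch below assumes it is the fraction of token positions on which the feature's activation is above zero. That definition, the function name `activation_density`, and the sample data are all assumptions for illustration; the source does not say how its number was computed.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumed definition: density = (# positions where the feature fires) / (# positions).
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical activations over 1000 token positions; the feature fires on 3 of them.
sample = [0.0] * 997 + [1.2, 0.4, 3.1]
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.300%"
```

Under this reading, a density of 0.223% would mean the feature fires on roughly 2 of every 1,000 tokens, which fits a narrow feature tied to words such as "own" and "próprio".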