INDEX
Explanations
narratives involving conflict and resolution
Negative Logits
estão    -0.18
podem    -0.17
possono  -0.17
hacen    -0.16
hanno    -0.15
oken     -0.15
tie      -0.15
são      -0.14
SEN      -0.14
vede     -0.14
Positive Logits
tu    0.39
fue   0.33
tu    0.30
Tu    0.29
se    0.29
Tu    0.29
dio   0.28
pudo  0.25
hizo  0.23
dio   0.23
Activations Density 0.045%