INDEX
Explanations
references to personal pronouns indicating relationships and interactions
Negative Logits
Token        Logit
citoyen      -0.33
botines      -0.32
Autres       -0.31
bañ          -0.31
fijo         -0.30
sauvage      -0.29
muere        -0.28
tatuajes     -0.28
colegios     -0.28
libremente   -0.28
Positive Logits
Token        Logit
<unused41>   1.04
[@BOS@]      1.03
<unused8>    1.03
<unused23>   1.03
<unused51>   1.03
<pad>        1.03
<unused3>    1.03
<unused16>   1.03
<unused52>   1.03
<unused14>   1.03
Activations Density: 0.023%