Explanations
strong negations or refusals
Negative Logits
Token            Logit
Bourgoin         -0.89
PyTuple          -0.69
Amalia           -0.66
gangen           -0.66
cref             -0.65
หวัด              -0.64
condamné         -0.62
Constitucional   -0.61
contribué        -0.60
devons           -0.59
Positive Logits
Token            Logit
no               1.28
No               1.09
nessun           1.02
nessuna          1.02
NO               0.96
No               0.96
any              0.91
Kein             0.91
ningún           0.88
ника             0.86
Activations Density: 0.089%
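
The density figure above is commonly read as the share of sampled tokens on which the feature activates at all. The sketch below illustrates that reading under stated assumptions: the function name `activation_density`, the zero threshold, and the sample data are all hypothetical and are not taken from the source page.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: a token counts as "active" when its activation is strictly
    greater than the threshold (0.0 by default).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: 10,000 tokens, 9 of which activate the feature.
sample = [0.0] * 9991 + [1.5] * 9
density = activation_density(sample)
print(f"{density:.3f}%")
```

A density this low would mean the feature fires on roughly one token in a thousand, which fits a feature tied to a narrow set of tokens such as the negation words listed under Positive Logits.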