INDEX
Explanations
words related to various societal roles and conditions
"Everyone" or similar all-encompassing terms
everyone knows
Negative Logits
misschien   -0.64
whole       -0.64
often       -0.61
spesso      -0.59
tutto       -0.58
all         -0.55
ganzen      -0.55
måske       -0.55
muchos      -0.55
hiç         -0.54
Positive Logits
except      1.49
except      1.40
Except      1.28
kecuali     1.26
Except      1.24
EXCEPT      1.23
excepto     1.21
đều         1.20
sauf        1.18
imaginable  1.09
Activations Density: 0.611%