Explanations
specific symbols or characters
symbols or characters that represent intensity or urgency
Negative Logits
Token    Logit
Tob      -0.73
Obi      -0.69
interf   -0.68
Bent     -0.67
transc   -0.67
phen     -0.66
Brist    -0.65
Tek      -0.65
bes      -0.64
Sok      -0.64
Positive Logits
Token    Logit
Ļ        1.78
¬        1.34
ª        1.32
¡        1.26
ı        1.24
ħ        1.24
į        1.20
«        1.18
µ        1.16
Ń        1.16
Activations Density: 0.450%