INDEX
Explanations
sequences involving a specific symbol, represented in the activations by 'âĢ'
the character representation of a symbol or emoticon
Negative Logits
Tunis         -0.77
Libyan        -0.75
Kenyan        -0.73
scattering    -0.69
guided        -0.69
diffusion     -0.67
Eisen         -0.64
guidance      -0.64
Counsel       -0.63
memorandum    -0.63
Positive Logits
¬    1.31
¡    1.27
¹    1.27
½    1.23
«    1.21
Ń    1.21
¿    1.21
į    1.19
Į    1.19
ª    1.18
Activations Density: 0.327%
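
The 'âĢ' string in the first explanation and the single-character tokens in the positive-logit list look like byte-level BPE token strings rather than literal text. The sketch below assumes the model uses a GPT-2-style byte-to-character table, which this page does not state; the helper names `bytes_to_unicode` and `token_to_bytes` are illustrative. Under that assumption, it maps each token string back to raw bytes and shows which Unicode character results when a positive-logit token follows 'âĢ'.

```python
def bytes_to_unicode():
    """Byte -> character table in the style of GPT-2's byte-level BPE (assumed)."""
    # Bytes that are kept as their own character: '!'..'~', U+00A1..U+00AC, U+00AE..U+00FF.
    printable = (
        list(range(0x21, 0x7F))
        + list(range(0xA1, 0xAD))
        + list(range(0xAE, 0x100))
    )
    byte_values = printable[:]
    codepoints = printable[:]
    shift = 0
    # Every remaining byte is assigned the next free code point starting at U+0100.
    for b in range(256):
        if b not in printable:
            byte_values.append(b)
            codepoints.append(256 + shift)
            shift += 1
    return {b: chr(c) for b, c in zip(byte_values, codepoints)}


BYTE_TO_CHAR = bytes_to_unicode()
CHAR_TO_BYTE = {c: b for b, c in BYTE_TO_CHAR.items()}


def token_to_bytes(token: str) -> bytes:
    """Convert a token's display string back to the raw bytes it stands for."""
    return bytes(CHAR_TO_BYTE[ch] for ch in token)


# The token named in the explanation.
prefix = token_to_bytes("âĢ")
print(prefix.hex())  # e280

# Tokens copied from the positive-logit list on this page.
positive_tokens = ["¬", "¡", "¹", "½", "«", "Ń", "¿", "į", "Į", "ª"]
for tok in positive_tokens:
    completed = (prefix + token_to_bytes(tok)).decode("utf-8")
    print(repr(tok), f"U+{ord(completed):04X}")
```

If the assumption holds, 'âĢ' is the two-byte prefix `E2 80`, and each positive-logit token supplies the third byte of a character in the U+2000 block (general punctuation and formatting marks). That would be consistent with both explanations describing a partially encoded symbol.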