INDEX
Explanations
mentions of social media platforms
special characters or unusual symbols in the text
Negative Logits
scattering    -0.81
scatter       -0.74
theless       -0.70
Counsel       -0.69
princ         -0.67
semic         -0.65
Corinth       -0.64
diffusion     -0.63
Afric         -0.63
iewicz        -0.61
Positive Logits
į     1.02
Į     0.97
¹     0.96
¬     0.96
§     0.94
º     0.94
½     0.92
¿     0.92
ı     0.91
¡     0.89
Activations Density: 0.475%