INDEX
Explanations
random characters and words that appear to have no clear relation to a specific pattern or theme
specific formatting or notation styles used in numerical representations
Negative Logits
wagen       -0.69
ritic       -0.69
aceutical   -0.68
sucker      -0.67
stitial     -0.65
ydia        -0.64
nces        -0.64
rod         -0.64
cki         -0.64
punk        -0.63
Positive Logits
ÃĹ                  1.12
âĢ¢âĢ¢âĢ¢âĢ¢        1.03
âĢ¢âĢ¢              0.92
++++++++++++++++    0.78
++++++++            0.74
_>                  0.74
-+-+                0.72
··                  0.69
ï¸                  0.67
ÃĹ                  0.65
Activations Density 0.009%