Explanations
mathematical and formal proof structures
Negative Logits
Token     Logit
]         -0.36
)         -0.35
],        -0.34
),        -0.33
},        -0.32
}}        -0.32
);        -0.31
}:        -0.31
}         -0.30
],↵       -0.30
Positive Logits
Token     Logit
}↵        0.16
}↵↵       0.16
(«        0.14
!“        0.14
Nová      0.14
ENTE      0.13
립        0.13
گز        0.13
ائر       0.12
}()↵↵     0.12
Activations Density: 0.098%