INDEX
Explanations
evidence of effectiveness or success in various contexts
Negative Logits
Token                 Logit
(                     -0.68
McCarthy              -0.58
/                     -0.57
(no visible token)    -0.56
Katz                  -0.55
(                     -0.54
Katz                  -0.53
[                     -0.52
co                    -0.51
f                     -0.50
Positive Logits
Token                 Logit
proved                1.79
proves                1.74
Prove                 1.63
prove                 1.58
proving               1.56
proven                1.52
Proving               1.52
prove                 1.46
Prove                 1.43
proved                1.42
Activations Density 0.137%