INDEX
Explanations
words related to errors or failures
references to mistakes and errors
Negative Logits
wired         -0.74
hibit         -0.70
rer           -0.69
rix           -0.69
cend          -0.68
fitted        -0.65
lain          -0.65
skin          -0.65
icum          -0.63
Supp          -0.63
Positive Logits
mistakes      3.86
errors        2.59
mistake       2.05
flaws         1.93
faults        1.92
failings      1.89
errors        1.88
Errors        1.86
shortcomings  1.83
failures      1.83
Activations Density: 0.027%