INDEX
Explanations
incorrectly spelled words
references to correctness or accuracy in actions or statements
Negative Logits
Token      Logit
aunt       -0.79
ires       -0.69
hunt       -0.65
assment    -0.65
ike        -0.63
Rise       -0.63
ofi        -0.59
enture     -0.58
dark       -0.58
our        -0.58
Positive Logits
Token            Logit
correctly        3.89
incorrectly      2.63
properly         2.37
accurately       2.25
wisely           1.90
wrongly          1.85
appropriately    1.83
errone           1.80
rightly          1.78
correct          1.69
Activations Density: 0.012%