Explanations
phrases related to errors or mistakes
instances of the word "error" and related terms
Negative Logits
Token      Logit
apeake     -0.83
apy        -0.82
amen       -0.81
tsky       -0.81
electric   -0.80
nai        -0.80
bledon     -0.78
estine     -0.78
Electric   -0.76
APTER      -0.71
Positive Logits
Token         Logit
error         0.91
guiActiveUn   0.88
gered         0.87
margin        0.84
errors        0.84
ously         0.82
dece          0.79
mishand       0.78
deceive       0.77
message       0.74
Activations Density: 0.022%
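
For readers unfamiliar with the density figure, here is a minimal sketch of how such a number could be computed. It assumes that "activations density" means the fraction of sampled tokens on which the feature's activation is above zero; the page itself does not state the definition. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and used only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of activations strictly above `threshold`.

    Assumption: density is defined as (active tokens) / (all sampled tokens).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 tokens, of which 2 have a nonzero activation.
sample = [0.0] * 9998 + [1.3, 0.4]
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.020%"
```

Under this reading, a density of 0.022% would mean the feature fires on roughly 2 out of every 10,000 tokens, which fits a sparse feature tied to a narrow concept such as error-related wording.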