INDEX
Explanations
phrases related to mistakes and errors in a medical or legal context
Negative Logits
Ivar          -0.47
astify        -0.46
cuestion      -0.44
liek          -0.44
inner         -0.44
Denied        -0.44
issão         -0.43
intenant      -0.43
Inner         -0.42
antMatchers   -0.42
Positive Logits
correctly       0.77
contentLoaded   0.69
wrong           0.67
malah           0.67
mistakenly      0.67
correct         0.66
correctly       0.64
wrong           0.63
juist           0.62
Wrong           0.62
Activations Density: 0.178%