INDEX
Explanations
phrases indicating a deliberate choice to disregard or overlook something
instances of the word "ignore."
Negative Logits
ramer      -0.93
unal       -0.92
uliffe     -0.86
emetery    -0.81
arter      -0.80
alg        -0.75
urther     -0.74
gran       -0.74
aver       -0.73
raq        -0.73
Positive Logits
ignore           0.89
ignores          0.82
ignoring         0.74
underestimate    0.73
aside            0.72
overlook         0.72
ignored          0.71
fulness          0.70
neglect          0.68
obe              0.65
Activations Density 0.012%
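
The density figure can be read as the share of token positions on which the feature fires at all. The page does not say how it was computed, so the following is only a minimal sketch under that assumption; the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions with a
    nonzero (above-threshold) activation. The source does not confirm this.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 3 active positions out of 25,000 tokens.
acts = [0.0] * 24_997 + [1.2, 0.4, 3.1]
density = activation_density(acts)
print(f"{density:.3%}")  # prints 0.012%
```

Under this reading, a density of 0.012% would mean the feature is active on roughly 12 of every 100,000 tokens, which fits a feature tied to a narrow family of words such as "ignore".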