Explanations
the word "avoided" with varying degrees of emphasis
instances of the word "avoided" and related terms
Negative Logits
iop      -1.02
ingham   -0.99
alyst    -0.76
ocrats   -0.71
odore    -0.68
eds      -0.68
halls    -0.67
opter    -0.67
iate     -0.66
essee    -0.66
Positive Logits
avoided    1.09
avoids     0.98
avoid      0.97
avoiding   0.90
avoidance  0.87
aver       0.84
avoid      0.81
Avoid      0.80
vana       0.78
icho       0.77
Activations Density: 0.009%