INDEX
Explanations
instances where something is demonstrated or presented
instances of the word "showing" in various contexts
Negative Logits
tymology    -0.83
undai       -0.82
etheless    -0.81
rit         -0.78
cult        -0.78
perty       -0.72
orrow       -0.71
rafted      -0.68
lake        -0.67
tar         -0.67
Positive Logits
Hide        0.91
shows       0.87
imony       0.83
-+-+        0.81
attRot      0.81
SHOW        0.80
Redditor    0.79
Show        0.78
Shows       0.77
ered        0.77
Activations Density: 0.009%