Explanations
button-related elements in the text
Negative Logits
Token               Logit
eX                  -0.87
Puck                -0.74
CLE                 -0.69
tation              -0.69
This                -0.68
That                -0.66
buttonShape         -0.65
Tung                -0.65
%%%%%%%%%%%%%%%%    -0.65
ture                -0.65
Positive Logits
Token               Logit
btn                 1.16
########.           1.00
httphttps           0.97
btn                 0.93
Nors                0.91
Schulte             0.91
Yeats               0.91
Btn                 0.86
McMillan            0.85
vs                  0.84
Activations Density 0.054%