Explanations
instances of the word "that."
Head Attr Weights
0:0.11
1:0.16
2:0.03
3:0.04
4:0.03
5:0.25
6:0.03
7:0.03
8:0.06
9:0.07
10:0.07
11:0.05
Negative Logits
版             -1.79
assetsadobe   -1.77
istg          -1.62
photoc        -1.57
suspended     -1.56
disav         -1.51
intermedi     -1.50
telev         -1.49
wrists        -1.49
swast         -1.46
Positive Logits
col                 2.03
hing                1.89
(token not shown)   1.89
________________    1.82
rez                 1.81
zy                  1.79
(token not shown)   1.76
SPONSORED           1.76
Dar                 1.75
·                   1.74
Activations Density: 0.004%