INDEX
Explanations
phrases emphasizing detailed explanations or inquiries about specific subjects
Head Attr Weights
0:0.04
1:0.02
2:0.10
3:0.04
4:0.30
5:0.06
6:0.02
7:0.02
8:0.06
9:0.22
10:0.05
11:0.02
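The head weights above are easier to compare once ranked. The following is a minimal sketch, not part of the original dashboard, that copies the twelve listed values into a dictionary and reports the heads with the largest weight. The names `head_weights` and `top_heads` are hypothetical, and treating the weights as shares of their rounded total is an assumption made only for illustration.

```python
# Head attribution weights copied from the listing above (head index -> weight).
head_weights = {
    0: 0.04, 1: 0.02, 2: 0.10, 3: 0.04, 4: 0.30, 5: 0.06,
    6: 0.02, 7: 0.02, 8: 0.06, 9: 0.22, 10: 0.05, 11: 0.02,
}


def top_heads(weights, k=2):
    """Return the k (head, weight) pairs with the largest weight (hypothetical helper)."""
    return sorted(weights.items(), key=lambda kv: kv[1], reverse=True)[:k]


# The listed values are rounded, so their sum need not be exactly 1.
total = round(sum(head_weights.values()), 2)

top = top_heads(head_weights)

# Share of the listed total carried by the top-k heads (assumption: weights are comparable shares).
top_share = round(sum(w for _, w in top) / total, 2)

print("total:", total)
print("top heads:", top)
print("top share:", top_share)
```

With the values listed here, heads 4 and 9 carry the largest weights.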
Negative Logits
gasp  -1.44
assador  -1.34
ÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂ  -1.33
interstitial  -1.28
Adin  -1.28
anova  -1.23
rals  -1.22
ush  -1.21
Sidney  -1.21
inia  -1.20
Positive Logits
��  1.82
Transcript  1.34
unfolding  1.30
detail  1.29
detailing  1.29
erous  1.28
ascript  1.27
��  1.26
Firearms  1.26
hyde  1.26
Activations Density 0.017%