INDEX
Explanations
interrogative words and phrases, particularly questions or inquiries
Head Attr Weights
0:0.02
1:0.03
2:0.12
3:0.13
4:0.01
5:0.03
6:0.04
7:0.10
8:0.06
9:0.19
10:0.10
11:0.13
Negative Logits
hement    -1.41
eries     -1.34
geoning   -1.32
uably     -1.30
vir       -1.28
renched   -1.25
alia      -1.19
raught    -1.18
pmwiki    -1.17
rolled    -1.17
Positive Logits
anymore       1.50
behavi        1.29
newsp         1.24
terminology   1.23
myself        1.22
nor           1.16
whereabouts   1.15
Palestin      1.15
explan        1.12
Doodle        1.11
Activations Density 0.033%