INDEX
Explanations
prepositions or phrases indicating purpose or relationship
Head Attr Weights
0:0.02
1:0.04
2:0.08
3:0.29
4:0.02
5:0.03
6:0.06
7:0.11
8:0.06
9:0.07
10:0.09
11:0.08
Negative Logits
abbling: -1.23
ield: -1.23
opter: -1.15
iflower: -1.13
esters: -1.13
NAS: -1.13
Omaha: -1.13
dies: -1.10
OGR: -1.09
ificial: -1.07
Positive Logits
propri: 1.34
probabilities: 1.34
REDACTED: 1.33
timestamp: 1.31
perspect: 1.30
particulars: 1.29
totality: 1.25
conclud: 1.25
derog: 1.22
ali: 1.20
Activations Density: 0.012%