INDEX
Explanations
phrases indicating immediacy or current relevance
Head Attr Weights
0:0.01
1:0.03
2:0.11
3:0.03
4:0.02
5:0.04
6:0.15
7:0.23
8:0.05
9:0.08
10:0.07
11:0.14
Negative Logits
breakup    -1.11
mask       -1.06
hid        -1.06
formance   -0.97
acronym    -0.96
urat       -0.95
triangle   -0.94
ミ          -0.93
arb        -0.93
duty       -0.91
Positive Logits
seekers          1.06
zyk              1.01
OIL              1.00
wat              0.96
SPONSORED        0.94
sear             0.90
archaeologists   0.90
sql              0.90
Wik              0.90
emis             0.89
Activations Density 0.017%