INDEX
Explanations
abstract concepts related to identity and existence
Head Attr Weights
0:0.06
1:0.06
2:0.10
3:0.05
4:0.03
5:0.09
6:0.06
7:0.11
8:0.07
9:0.08
10:0.15
11:0.08
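The per-head weights above can be ranked programmatically. The sketch below is only an illustration: it assumes each line is a `head:weight` pair, and the helper name `parse_head_weights` is hypothetical, not part of any tool behind this page. The page does not define how the weights are normalized, so the sum is reported as-is rather than assumed to be 1.

```python
# Head attribution weights copied verbatim from the listing above ("head:weight").
RAW = """0:0.06
1:0.06
2:0.10
3:0.05
4:0.03
5:0.09
6:0.06
7:0.11
8:0.07
9:0.08
10:0.15
11:0.08"""


def parse_head_weights(text):
    """Parse 'head:weight' lines into a {head_index: weight} dict (hypothetical helper)."""
    weights = {}
    for line in text.splitlines():
        head, _, value = line.strip().partition(":")
        weights[int(head)] = float(value)
    return weights


weights = parse_head_weights(RAW)

# Head with the largest listed weight.
top_head = max(weights, key=weights.get)

# Heads ordered from largest to smallest weight (ties keep listing order).
ranked = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)

# Sum of the displayed (rounded) weights.
total = round(sum(weights.values()), 2)

print(f"top head: {top_head} (weight {weights[top_head]:.2f})")
print(f"sum of listed weights: {total:.2f}")
```

With the values listed, head 10 carries the largest weight (0.15), followed by heads 7 and 2. The displayed weights sum to 0.94; the shortfall from 1 may simply reflect rounding to two decimals, but that is an assumption.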
Negative Logits
conclud    -1.44
advoc      -0.99
romeda     -0.99
reluct     -0.98
turnout    -0.96
jug        -0.96
scrut      -0.95
manufact   -0.95
predec     -0.95
ernel      -0.92
Positive Logits
enei                1.08
natureconservancy   1.07
agara               1.04
�                   1.03
SU                  1.02
ㅋ                   1.01
urnal               0.97
phan                0.96
wered               0.96
adena               0.95
Activations Density 0.188%