Explanations
references to social or political equality issues
Head Attr Weights
0:0.02
1:0.37
2:0.07
3:0.03
4:0.02
5:0.06
6:0.09
7:0.05
8:0.06
9:0.05
10:0.07
11:0.06
Negative Logits
DragonMagazine     -1.76
ertodd             -1.71
NetMessage         -1.69
natureconservancy  -1.65
ACTIONS            -1.53
bley               -1.45
Ended              -1.44
ministic           -1.43
FTWARE             -1.43
prototype          -1.40
Positive Logits
]    4.23
?]   4.09
']   4.01
]:   4.00
]"   3.93
…]   3.89
],   3.84
].   3.81
!]   3.76
.]   3.61
Activations Density 0.106%