Explanations
references to decision-making processes, particularly in cooperative contexts or game theory scenarios
Head Attr Weights
0:0.08
1:0.04
2:0.09
3:0.26
4:0.04
5:0.12
6:0.04
7:0.08
8:0.02
9:0.04
10:0.10
11:0.03
Negative Logits
churches        -2.73
colleges        -2.56
universities    -2.54
studios         -2.50
laboratories    -2.49
Churches        -2.42
dictators       -2.41
Universities    -2.37
campuses        -2.34
sects           -2.29
Positive Logits
clicked         2.19
Previous        2.19
ancel           2.18
incoming        2.15
downed          2.14
item            2.14
sent            2.13
requested       2.13
Done            2.12
Result          2.10
Activations Density: 0.597%