INDEX
Explanations
references to personal reflections and social commentary
Head Attr Weights
0:0.02
1:0.02
2:0.06
3:0.12
4:0.06
5:0.04
6:0.04
7:0.04
8:0.04
9:0.08
10:0.33
11:0.10
Negative Logits
dominates: -1.66
forms: -1.57
Sharia: -1.56
regulates: -1.44
threatens: -1.34
violates: -1.31
governs: -1.31
Boko: -1.31
menace: -1.30
Instit: -1.29
Positive Logits
went: 1.55
undreds: 1.52
��: 1.49
ted: 1.48
blogging: 1.45
irtual: 1.43
iphany: 1.43
myself: 1.35
aptop: 1.31
FINE: 1.31
Activations Density 0.867%
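
The head attribution weights listed above can be ranked to see which heads contribute most. The following is a minimal sketch, assuming the `index:weight` pairs are per-attention-head attribution shares (the listing itself does not say so); the helper name `top_heads` is hypothetical and not part of any tool.

```python
# Head attribution weights copied from the listing above (head index -> weight).
# Assumption: each entry is one attention head's share of the attribution.
head_weights = {
    0: 0.02, 1: 0.02, 2: 0.06, 3: 0.12, 4: 0.06, 5: 0.04,
    6: 0.04, 7: 0.04, 8: 0.04, 9: 0.08, 10: 0.33, 11: 0.10,
}


def top_heads(weights, k=3):
    """Return the k (head, weight) pairs with the largest weights (hypothetical helper)."""
    return sorted(weights.items(), key=lambda kv: kv[1], reverse=True)[:k]


# Round to two decimals to avoid floating-point noise in the sum.
total = round(sum(head_weights.values()), 2)
top = top_heads(head_weights)

print("total weight:", total)
for head, weight in top:
    print(f"head {head}: {weight:.2f}")
```

Ranking by weight rather than by index makes it easy to see that head 10 carries the largest single share, and the rounded total shows the listed weights do not sum exactly to 1.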