INDEX
Explanations
occurrences of specific terms related to legal or criminal actions
Head Attr Weights
0:0.11
1:0.07
2:0.07
3:0.08
4:0.04
5:0.11
6:0.07
7:0.03
8:0.07
9:0.14
10:0.09
11:0.07
Negative Logits
scrut            -1.67
htaking          -1.47
sustainability   -1.43
behavi           -1.33
incent           -1.27
mitigation       -1.26
lifes            -1.25
newsp            -1.24
welf             -1.23
lapt             -1.22
Positive Logits
/,    1.63
-.    1.51
/.    1.49
([    1.41
�     1.40
.[    1.35
/?    1.34
�     1.34
/"    1.33
�     1.32
Activations Density: 0.029%