Explanations
references to various forms of law or legal frameworks
Head Attr Weights
0:0.05
1:0.08
2:0.14
3:0.03
4:0.05
5:0.09
6:0.08
7:0.09
8:0.12
9:0.05
10:0.09
11:0.09
Negative Logits
Token       Logit
uph         -0.99
urden       -0.89
Hobby       -0.89
peak        -0.83
Euph        -0.81
��          -0.81
minded      -0.78
iter        -0.78
iche        -0.75
rendered    -0.73
Positive Logits
Token       Logit
Restaur     0.92
oğan        0.88
jamin       0.84
yang        0.83
obyl        0.82
engeance    0.81
glas        0.81
riott       0.79
roma        0.79
roversial   0.79
Activations Density 0.551%
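
The density figure above is not defined on this page. A common reading, assumed here, is the fraction of sampled token positions on which the feature has a nonzero activation. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and do not come from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature fires. The dashboard itself does not state its definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 1000 token positions, 5 of them active.
sample = [0.0] * 995 + [0.7, 1.2, 0.3, 2.1, 0.05]
print(f"{activation_density(sample):.3%}")  # prints 0.500%
```

Under this reading, a density of 0.551% would mean the feature fires on roughly 5 or 6 of every 1000 sampled tokens.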