INDEX
Explanations
terms related to law enforcement and legal cases
Head Attr Weights
0:0.02
1:0.02
2:0.08
3:0.12
4:0.37
5:0.03
6:0.03
7:0.09
8:0.03
9:0.03
10:0.08
11:0.04
Negative Logits
��:-1.89
Instruct:-1.72
inas:-1.59
Parables:-1.57
united:-1.57
icult:-1.51
pione:-1.46
Engineers:-1.46
eret:-1.45
irled:-1.42
Positive Logits
altogether:3.19
entirely:2.08
remaining:1.90
indefinitely:1.88
outright:1.87
reliance:1.85
penalties:1.77
entirety:1.76
offending:1.75
needless:1.74
Activations Density 0.331%