Explanations
specific terms related to legal and administrative actions
Head Attr Weights
0:0.03
1:0.01
2:0.03
3:0.05
4:0.03
5:0.03
6:0.11
7:0.02
8:0.03
9:0.56
10:0.02
11:0.01
Negative Logits
Elise: -3.27
onion: -3.02
reptiles: -2.96
�: -2.96
ince: -2.88
entin: -2.85
emot: -2.83
Mem: -2.81
emort: -2.75
ussen: -2.71
Positive Logits
b: 5.62
B: 5.51
BI: 5.48
BB: 5.45
bb: 5.39
bm: 5.33
Bib: 5.22
BA: 5.21
IB: 5.17
bah: 5.04
Activations Density 0.382%