INDEX
Explanations
phrases related to inquiries and responses in a legal or investigative context
Head Attr Weights
0:0.02
1:0.01
2:0.06
3:0.38
4:0.10
5:0.04
6:0.03
7:0.05
8:0.06
9:0.07
10:0.07
11:0.05
Negative Logits
,'"     -2.39
!",     -2.22
',"     -1.89
':      -1.75
,'      -1.74
…"      -1.72
..."    -1.70
.,"     -1.69
!'      -1.68
,"      -1.66
Positive Logits
.).     2.72
).      2.23
?).     2.21
).      2.17
%).     2.03
)).     1.99
).[     1.89
]).     1.80
].      1.76
].      1.65
Activations Density 0.097%