INDEX
Explanations
references to investigations or inquiries, particularly those involving scrutiny or questioning
Head Attribution Weights
0:0.06
1:0.03
2:0.13
3:0.06
4:0.10
5:0.07
6:0.03
7:0.03
8:0.13
9:0.20
10:0.06
11:0.03
Negative Logits
weet:-1.39
ahime:-1.25
natureconservancy:-1.22
olson:-1.19
alli:-1.17
uphem:-1.13
izabeth:-1.09
undle:-1.09
eva:-1.07
appl:-1.07
Positive Logits
probing:1.55
Probe:1.38
probes:1.24
probe:1.24
Collider:1.21
uncover:1.14
Leaks:1.10
��:1.07
NRS:1.07
Plat:1.06
Activations Density 0.003%