INDEX
Explanations
indicators of ongoing investigations or inquiries
Head Attr Weights
0:0.04
1:0.03
2:0.05
3:0.13
4:0.08
5:0.04
6:0.05
7:0.03
8:0.10
9:0.07
10:0.10
11:0.22
Negative Logits
Sta      -1.70
etz      -1.67
Faul     -1.60
YA       -1.60
DM       -1.60
oun      -1.59
gew      -1.55
oaded    -1.54
DN       -1.54
pled     -1.51
Positive Logits
levard        1.96
hippocampus   1.92
thriller      1.82
trillion      1.81
nanop         1.74
Reuters       1.74
thereum       1.73
ripple        1.71
glitch        1.68
hran          1.67
Activations Density 0.003%