INDEX
Explanations
names of locations and titles with a focus on criminal activities
proper nouns and significant entities within the text
Negative Logits
ALS       -0.88
Unch      -0.88
acknow    -0.88
Pes       -0.84
IB        -0.79
MIS       -0.79
IPM       -0.78
estab     -0.78
thous     -0.76
Antar     -0.75
Positive Logits
shop      1.39
organ     1.39
run       1.39
floor     1.38
part      1.37
move      1.37
do        1.36
free      1.35
max       1.34
mist      1.33
Activations Density 0.389%
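
The page does not define how the density figure is computed. As a rough sketch, assuming it is the fraction of sampled token positions on which the feature's activation is above zero, it could be calculated as below. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and used only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions on which
    the feature fires at all; the source page does not state this.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical data: 1000 token positions, 4 of which activate the feature.
sample = [0.0] * 996 + [0.7, 1.2, 0.3, 2.5]
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.400%"
```

Under this reading, a density of 0.389% would mean the feature fires on roughly 4 out of every 1000 sampled tokens.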