INDEX
Explanations
references to specific places and incidents related to crime or disturbances
Negative Logits
cheon      -0.16
sterdam    -0.15
LBL        -0.15
主人       -0.14
moyen      -0.14
orial      -0.14
abic       -0.14
@Module    -0.14
izens      -0.14
uhe        -0.14
Positive Logits
sol        0.22
Sol        0.20
Sol        0.19
Doc        0.18
SOL        0.17
fac        0.17
.sol       0.16
doc        0.16
.Doc       0.16
Doc        0.16
Activations Density: 0.002%