Explanations
phrases related to legal and law enforcement activities
instances of violence or physical altercations
Negative Logits
domestically      -0.78
curated           -0.73
sustainability    -0.72
marquee           -0.70
fandom            -0.70
innov             -0.69
inclusion         -0.69
neb               -0.67
carrot            -0.67
aligned           -0.67
Positive Logits
Later             1.42
Eventually        1.40
Then              1.39
Upon              1.32
Investigators     1.32
Attempts          1.29
However           1.23
Police            1.22
Meanwhile         1.22
According         1.21
Activations Density 0.381%