Explanations
mentions of police or government operations where they search or seize items
terms related to police or military actions, particularly raids
Negative Logits
gran      -0.71
Alps      -0.71
pure      -0.69
present   -0.66
faith     -0.65
bearing   -0.65
guy       -0.64
peer      -0.64
å¦        -0.64
uber      -0.64
Positive Logits
raided    0.97
ãĥĥãĥĪ    0.93
ccoli     0.85
ĵĺ        0.78
ãĤ¤ãĥĪ    0.76
Ń·        0.76
apons     0.75
ashtra    0.75
ategory   0.73
ataka     0.70
Activations Density 0.011%