INDEX
Explanations
phrases related to actions taken against illegal activities or measures put in place to protect specific groups or entities
references to "sanctuary" or related concepts
Negative Logits
llor      -0.81
Nebula    -0.74
drivers   -0.68
ï¸ı       -0.67
ives      -0.65
drive     -0.65
destro    -0.65
REE       -0.64
llers     -0.64
urned     -0.63
Positive Logits
ctuary     1.46
ction      1.10
ctions     0.98
San        0.97
itary      0.94
gha        0.92
itized     0.88
Francisco  0.88
terior     0.87
anton      0.87
Activations Density: 0.018%