INDEX
Explanations
words related to surveillance, politics, and urban environments
Negative Logits
aceutical     -0.76
Constructed   -0.70
enegger       -0.69
UTION         -0.64
ersion        -0.63
ometers       -0.63
uality        -0.63
70710         -0.63
ãĥ¼ãĥ³        -0.62
8000          -0.61
Positive Logits
reth          1.07
lette         1.05
vernment      1.02
rier          0.99
hou           0.98
rer           0.93
bles          0.92
levard        0.92
pees          0.92
rown          0.90
Activations Density: 0.794%