Explanations
references to surveillance and control within societal contexts
Negative Logits
Token     Logit
Huff      -0.16
-Token    -0.15
rane      -0.15
åĤ¬       -0.14
ixin      -0.14
ovie      -0.14
-theme    -0.14
ypse      -0.14
æŃ        -0.14
Pou       -0.14
Positive Logits
Token             Logit
ample             0.18
.communication    0.14
topl              0.14
Gregg             0.14
ìĬĪ               0.13
ogo               0.13
acro              0.13
imal              0.13
itious            0.13
ED                0.13
Activations Density 0.061%