INDEX
    Explanations

    phrases related to surveillance and control

    Negative Logits
    Token            Logit
    " reluct"        -1.52
    " guarante"      -1.43
    " encomp"        -1.42
    " impractica"    -1.42
    " unlaw"         -1.41
    " shenan"        -1.41
    " affor"         -1.40
    " indor"         -1.39
    " increa"        -1.39
    " resear"        -1.38
    Positive Logits
    Token            Logit
    " his"           1.10
    " their"         1.06
    "<bos>"          1.01
    "their"          0.95
    "his"            0.94
    " its"           0.93
    " her"           0.90
    " your"          0.83
    " suas"          0.83
    " seu"           0.82
    Activation Density: 0.668%

    No Known Activations