INDEX
    Explanations

    phrases related to prevention or obstruction

    phrases indicating prohibition or prevention

    Negative Logits
    aic         -0.72
    ector       -0.70
    rote        -0.69
    wait        -0.68
    elman       -0.66
    arse        -0.66
    abre        -0.66
    olitical    -0.64
    ety         -0.63
    lyak        -0.63
    Positive Logits
     accessing     1.50
     entering      1.40
     reaching      1.38
     harming       1.38
     obtaining     1.35
     interfering   1.34
     achieving     1.31
     completing    1.30
     joining       1.30
     gaining       1.29
    Activation Density: 0.068%

    No Known Activations