INDEX
    Explanations

    prepositional phrases indicating relationships or connections

    Negative Logits
    "incial"        -0.89
    "afia"          -0.75
    "idences"       -0.73
    "ederal"        -0.71
    "cci"           -0.70
    "merce"         -0.70
    "ossibility"    -0.70
    "anguage"       -0.70
    "aren"          -0.69
    "ignt"          -0.69
    Positive Logits
    " nowhere"      0.71
    " Rowe"         0.67
    " mole"         0.65
    " Gillespie"    0.64
    " Bris"         0.63
    " Buster"       0.61
    " existence"    0.60
    " him"          0.59
    " everything"   0.59
    " them"         0.58
    Activation Density: 0.025%

    No Known Activations