INDEX
    Explanations

    verb phrases indicating actions or states of being

    Negative Logits
    iden      -0.17
    uit       -0.15
    eph       -0.14
    èĢ        -0.14
    _lazy     -0.14
    arp       -0.13
     Deniz    -0.13
    RT        -0.13
    olin      -0.13
    pub       -0.13
    Positive Logits
     be       0.22
    rades     0.18
     have     0.16
     avoir    0.16
     having   0.16
     been     0.16
    .have     0.15
     being    0.15
    zych      0.14
    меть      0.14
    Act Density 0.076%

    No Known Activations