INDEX
    Explanations
    Negative Logits
    Token             Logit
    " on"             -1.45
    " in"             -1.31
    " just"           -1.14
    " because"        -1.08
    " when"           -1.07
    " with"           -1.07
    " any"            -1.07
    " what"           -1.05
    " When"           -1.03
    " seemingly"      -0.99
    Positive Logits
    Token             Logit
    " HARVARD"        1.11
    " ingenu"         1.07
    " Dinas"          1.06
    " ranting"        1.06
    " Hygge"          1.05
    " accla"          1.05
    "ulfon"           1.05
    " NEGATIVE"       1.05
    " ENVIRONMENTAL"  1.02
    " incontro"       1.02
    Activation Density: 0.025%

    No Known Activations