INDEX
    Explanations

    mention of specific locations or events

    Negative Logits
    Token       Logit
    " What"     -0.66
    "."         -0.64
    " There"    -0.64
    " It"       -0.62
    "!"         -0.62
    " That"     -0.61
    " How"      -0.61
    " The"      -0.60
    " Not"      -0.60
    "What"      -0.59
    Positive Logits
    Token           Logit
    " ftu"          1.65
    " fta"          1.64
    " swarovski"    1.52
    " ricardo"      1.50
    " jorge"        1.47
    " Juf"          1.46
    " dises"        1.46
    " sergio"       1.46
    " fup"          1.45
    " doman"        1.43
    Activation Density: 0.660%

    No Known Activations