    Explanations

    words related to actions of exploding or collapsing

    words related to implication and suggesting connections or consequences

    Negative Logits
    chal         -0.74
    bial         -0.67
    flix         -0.67
     passer      -0.66
    lette        -0.61
    ryu          -0.61
    liness       -0.60
     cleaners    -0.60
    BOOK         -0.60
    kees         -0.59
    Positive Logits
    osion        1.61
    oded         1.53
    ausible      1.53
    icating      1.43
    oding        1.36
    icate        1.35
    icates       1.26
    anting       1.23
    ications     1.23
    odes         1.14
    Activation Density: 0.019%

    No Known Activations