    Explanations

    words related to change or transformation

    significant actions or processes related to change or transformation

    Negative Logits
    "Its"           -0.67
    " Bah"          -0.65
    "Cub"           -0.65
    " Its"          -0.64
    "rican"         -0.59
    " believes"     -0.58
    " Bron"         -0.58
    " Diane"        -0.57
    "atform"        -0.57
    "Watch"         -0.56
    Positive Logits
    " themselves"     1.56
    " prolifer"       1.02
    " their"          0.97
    " individually"   0.94
    "selves"          0.94
    "their"           0.89
    " respectively"   0.88
    " respective"     0.87
    " counterparts"   0.87
    " varying"        0.84
    Act Density 0.656%

    No Known Activations