INDEX
    Explanations

    "a" followed by an adjective

    Negative Logits
    izm             -0.11
    771             -0.09
    icals           -0.09
    RIES            -0.09
     latent         -0.09
     backbone       -0.09
    ìĤ              -0.09
     Seks           -0.09
     heartbreaking  -0.09
     Bernstein      -0.09
    Positive Logits
     relief         0.13
     matter         0.13
     struggle       0.12
     turning        0.11
     sight          0.11
     exagger        0.11
     isol           0.11
     feeling        0.10
     moment         0.10
     toss           0.10
    Act Density 0.039%

    No Known Activations