INDEX
    Explanations

    phrases that emphasize or introduce lists or examples

    Negative Logits
    Token           Logit
    "aki"           -0.16
    "verte"         -0.15
    "ola"           -0.15
    "verter"        -0.15
    "eba"           -0.14
    "upertino"      -0.14
    "kaar"          -0.14
    "ZR"            -0.14
    "using"         -0.14
    "duct"          -0.14

    Positive Logits
    Token           Logit
    " instance"     0.25
    "unately"       0.20
    " example"      0.20
    "cing"          0.20
    "getting"       0.19
    "ged"           0.18
    "instance"      0.18
    "ced"           0.17
    "bid"           0.17
    " decades"      0.17
    Activation Density: 0.063%

    No Known Activations