INDEX
    Explanations

    the repetition of the word "just."

    Negative Logits
    Token      Logit
    ught       -0.17
    ody        -0.17
    ?p         -0.17
    بس         -0.16
    akin       -0.16
    agan       -0.15
    UGHT       -0.15
    razier     -0.15
    ODY        -0.14
    anic       -0.14
    Positive Logits
    Token          Logit
    ifications     0.26
    ifiable        0.25
    ifi            0.25
    ifying         0.23
    ifies          0.22
    IFI            0.22
    ification      0.20
    iciary         0.20
    ifiers         0.19
    ifica          0.18
    Activation Density: 0.054%

    No Known Activations