INDEX
    Explanations

    mentions of style or styling-related terms

    Negative Logits
    Token       Logit
    "itor"      -0.17
    "ksam"      -0.16
    "iter"      -0.16
    " Pazar"    -0.16
    "markt"     -0.16
    "çº"        -0.15
    "ITER"      -0.15
    "imary"     -0.15
    "ugin"      -0.14
    "stable"    -0.14
    Positive Logits
    Token       Logit
    "rene"      0.24
    "lish"      0.23
    "gia"       0.20
    " Sty"      0.19
    "lists"     0.18
    "list"      0.17
    " styl"     0.17
    "warts"     0.17
    " sty"      0.16
    "wart"      0.16
    Activation Density: 0.005%

    No Known Activations