INDEX
    Explanations

    phrases related to removal or significant change

    Negative Logits
    "phia"       -0.17
    " zel"       -0.15
    "agal"       -0.15
    "-format"    -0.15
    "argas"      -0.14
    "format"     -0.14
    "ubu"        -0.14
    "erve"       -0.14
    " format"    -0.14
    "Format"     -0.14
    Positive Logits
    "PLIED"      0.17
    "ourn"       0.16
    "بار"        0.14
    " stocks"    0.14
    "weis"       0.14
    "ska"        0.14
    " Crest"     0.14
    " esc"       0.14
    " stock"     0.13
    "해보"       0.13
    Act Density 0.002%

    No Known Activations