INDEX
    Explanations
    NEGATIVE LOGITS
    unguza      -0.08
     ceramics   -0.08
     hala       -0.08
    Ah          -0.08
    lighet      -0.08
     طراحی      -0.08
    还是        -0.08
     Ah         -0.08
     همچ        -0.07
                -0.07
    POSITIVE LOGITS
    stash       0.08
     venced     0.08
    straight    0.08
    完整        0.07
    ting        0.07
    .Special    0.07
     straight   0.07
     stacking   0.07
     semej      0.07
     stacked    0.07
    Act Density 0.013%

    No Known Activations