    Negative Logits
    "شه"          -0.07
    " sat"        -0.07
    "chat"        -0.07
    " hacked"     -0.07
    " marché"     -0.07
    "sets"        -0.06
                  -0.06
    "Fecha"       -0.06
    "\tprocess"   -0.06
    " tran"       -0.06
    Positive Logits
    " only"       0.23
    " Only"       0.19
    "Only"        0.18
    "only"        0.18
    " ONLY"       0.17
    "ONLY"        0.12
    "-only"       0.12
    " sólo"       0.11
    ".only"       0.10
    " chỉ"        0.09
    Act Density 0.078%

    No Known Activations