INDEX
    Negative Logits
    Token         Logit
    "aurant"      -0.07
    "guards"      -0.07
    " pans"       -0.06
    "Cha"         -0.06
    "clinical"    -0.06
    "rah"         -0.06
    " careful"    -0.06
    " grapes"     -0.06
    "جان"         -0.06
    "fan"         -0.06
    Positive Logits
    Token                                 Logit
    "ardo"                                0.06
    "ğını"                                0.06
    " Butler"                             0.06
    "++++++++++++++++++++++++++++++++"    0.06
    (blank in source)                     0.06
    "gang"                                0.06
    " eBay"                               0.06
    " founders"                           0.06
    " ((*"                                0.06
    (blank in source)                     0.06
    Act Density 0.000%

    No Known Activations