INDEX
    Explanations
    NEGATIVE LOGITS
    Token          Logit
     Receive       -0.07
     retorn        -0.07
     bekom         -0.07
     DEM           -0.06
    _QUERY         -0.06
     females       -0.06
     ragazza       -0.06
     spolup        -0.06
     thang         -0.06
     Homes         -0.06
    POSITIVE LOGITS
    Token          Logit
    [^             0.07
    ando           0.07
    SELECT         0.06
     -->↵↵↵        0.06
    _processors    0.06
    (not shown)    0.06
     вида          0.06
    (not shown)    0.05
    ît             0.05
    ایل            0.05
    Activation Density: 0.005%

    No Known Activations