INDEX
    Negative Logits
    Token             Logit
    " unterstüt"      -0.07
    " stunt"          -0.06
    (blank)           -0.06
    (blank)           -0.06
    " estate"         -0.06
    (blank)           -0.06
    "$sub"            -0.06
    "nge"             -0.06
    "measure"         -0.06
    "ufe"             -0.06
    Positive Logits
    Token             Logit
    "friend"          0.08
    "อห"              0.06
    "[href"           0.06
    " Whoever"        0.06
    "аб"              0.06
    " whoever"        0.06
    "ikler"           0.06
    (blank)           0.06
    ".subscriptions"  0.06
    (blank)           0.06
    Activation Density: 0.004%

    No Known Activations