    Negative Logits
    Token          Logit
    "ategorie"     -0.07
    "Str"          -0.07
    "res"          -0.07
    " Integer"     -0.07
    "obre"         -0.06
    ".resource"    -0.06
    " forces"      -0.06
    " bytes"       -0.06
    "rand"         -0.06
    "_words"       -0.06
    Positive Logits
    Token          Logit
    " signup"      0.07
    " Newman"      0.07
    " estudio"     0.07
    " пару"        0.07
    " Paula"       0.07
    " popup"       0.07
    "setup"        0.06
    " HQ"          0.06
    " setup"       0.06
    " Duffy"       0.06
    Activation Density: 0.013%

    No Known Activations