    Negative Logits
    " đều"            -0.06
    "іти"             -0.06
    " шляхом"         -0.06
    "con"             -0.06
    "icultural"       -0.06
    "ندر"             -0.06
    " portfolio"      -0.06
    "_ob"             -0.06
    " initiation"     -0.06
    " hosts"          -0.06
    Positive Logits
    "TASK"            0.08
    " Fired"          0.07
    "ACTION"          0.07
    " Dipl"           0.07
    " HV"             0.07
    "rip"             0.07
    " FAG"            0.06
    0.06
    "Validation"      0.06
    ".ms"             0.06
    Activation Density: 0.015%

    No Known Activations