INDEX
    Explanations

    Titles and names

    Negative Logits
    Token             Logit
    " Verification"   -0.07
    " Depression"     -0.06
    " scores"         -0.06
    "овий"            -0.06
    "OVID"            -0.06
    " depression"     -0.06
    " Tian"           -0.06
    ".Pop"            -0.06
    "За"              -0.06
    "ати"             -0.06

    Positive Logits
    Token             Logit
    " controle"       0.07
    "00"              0.07
    " kot"            0.06
    " comparer"       0.06
    " kup"            0.06
    "!!,"             0.06
    "`()"             0.06
                      0.06
    "-ln"             0.06
    "git"             0.06
    Activation Density: 0.074%

    No Known Activations