INDEX
    Explanations

    Positive descriptions

    NEGATIVE LOGITS
    Token             Logit
    "Material"        -0.07
    "Face"            -0.06
    " Caption"        -0.06
    " Message"        -0.06
    " GAS"            -0.06
    " Formal"         -0.06
    ".categories"     -0.06
    "riteria"         -0.06
    "gt"              -0.06
    "ubes"            -0.06
    POSITIVE LOGITS
    Token             Logit
    " disen"          0.07
    " urlString"      0.07
    " Giấy"           0.07
    " م"              0.07
    "(CONT"           0.06
    " forg"           0.06
    " vanity"         0.06
    " Anch"           0.06
    " mož"            0.06
    "])))"            0.06
    Activation Density: 0.167%

    No Known Activations