    Negative Logits
    Token                      Logit
    "opi"                      -0.07
    (whitespace-only token)    -0.07
    " Bewert"                  -0.06
    "_chat"                    -0.06
    "_properties"              -0.06
    " statue"                  -0.06
    " predator"                -0.06
    "اگ"                       -0.06
    " организм"                -0.06
    "abı"                      -0.06
    Positive Logits
    Token                      Logit
    "newInstance"              0.07
    " Giáo"                    0.07
    "oil"                      0.07
    "avatel"                   0.06
    "PARTMENT"                 0.06
    "oir"                      0.06
    ",Th"                      0.06
    " advise"                  0.06
    " DAL"                     0.06
    "Dal"                      0.06
    Activation Density: 0.010%

    No Known Activations