    Explanations

    purpose is to be helpful

    NEGATIVE LOGITS
     повто          0.64
    再次            0.63
     Source         0.63
    արդ             0.62
     Drama          0.62
    😨              0.61
     drama          0.61
    ダブル          0.60
     Performing     0.60
     زبان           0.59
    POSITIVE LOGITS
     ajudar         0.85
    ebo             0.80
     volunte        0.77
     volunteered    0.77
     设计           0.76
     servent        0.76
                    0.76
     help           0.75
     servidor       0.75
     proteger       0.75
    Act Density 0.026%

    No Known Activations