INDEX
    Explanations

    allows or helps actions

    Negative Logits
    " chod"  0.38
    " وعد"  0.37
    " pet"  0.36
    " mandate"  0.36
    " müs"  0.36
    "ونم"  0.36
    "níků"  0.35
    " hade"  0.35
    " headache"  0.35
    "าด"  0.35
    Positive Logits
    " helps"  1.70
    " помогает"  1.67
    " Helps"  1.50
    "helps"  1.48
    " يساعد"  1.38
    " aiuta"  1.37
    " помогают"  1.34
    " giúp"  1.30
    " обеспечивает"  1.29
    " membantu"  1.28
    Activation Density: 0.026%

    No Known Activations