INDEX
    Explanations

    asking for help or information

    Negative Logits
     しっかり    0.38
    しっかり    0.36
    жет    0.36
     неравен    0.35
     seizing    0.35
    attacking    0.35
     الحره    0.34
    用心    0.34
     नक्की    0.34
    0.34
    Positive Logits
     assistance    1.38
     help    1.23
     aiuto    1.19
    assistance    1.11
     hjälp    1.03
     помощь    1.02
     bantuan    1.02
     Hilfe    1.00
     Assistance    0.98
     помощи    0.98
    Activation Density 0.026%

    No Known Activations