    Explanations

    expressing strong liking

    Negative Logits
    Token            Value
    " далее"         0.39
    " совместно"     0.37
    " সমূহ"           0.36
    "飲食"            0.34
    "प्रभाव"           0.34
    " }^{*"          0.33
    " Importance"    0.33
    "purchase"       0.32
    "业绩"            0.32
    "Why"            0.32
    Positive Logits
    Token            Value
    " seeing"        0.82
    " hearing"       0.82
    " having"        0.78
    " watching"      0.71
    " being"         0.70
    " spending"      0.67
    " getting"       0.65
    " receiving"     0.61
    " playing"       0.61
    " interacting"   0.60
    Act Density 0.036%

    No Known Activations