    Explanations

    expressions of support or encouragement

    Negative Logits
    " therefore"   -0.31
    " Therefore"   -0.29
    "Therefore"    -0.27
    " поэтому"     -0.24
    " więc"        -0.24
    " hence"       -0.24
    " nên"         -0.23
    "，所以"       -0.22
    "所以"         -0.21
    " Hence"       -0.21
    Positive Logits
    " otherwise"   0.21
    "otherwise"    0.19
    " chances"     0.19
    "毕"           0.17
    " Otherwise"   0.16
    " Doing"       0.16
    " فإن"         0.15
    "ometr"        0.15
    "νή"           0.15
    "Otherwise"    0.15
    Act Density 0.284%

    No Known Activations