INDEX
    Explanations

    words indicating conclusion or termination

    Negative Logits
    EnableWeb      -0.62
     daarvan       -0.59
     ransom        -0.58
    MLLoader       -0.58
    retweeted      -0.57
    iented         -0.56
    braio          -0.55
    よいよ         -0.55
    <bos>          -0.54
    لينكات         -0.53
    Positive Logits
     ended         1.20
     ends          1.08
     ending        1.06
     Ended         0.93
    结束           0.92
     Ending        0.92
    Ended          0.91
     terminated    0.91
     Ends          0.89
     stopped       0.83
    Activation Density: 0.144%

    No Known Activations