    Explanations

    explaining things in detail

    Negative Logits
    " অথবা"        0.58
    " 또는"        0.50
    " અથવા"        0.46
    " Surprisingly"  0.45
    "または"        0.45
    " সর্বপ্রথম"     0.41
    " आश्चर्य"      0.41
    " ወይም"        0.41
    "但不"          0.41
    " или"         0.40
    Positive Logits
    "毕竟"          0.81
    " presumably"   0.77
    "잖아요"        0.69
    " inherently"   0.64
    " Presumably"   0.63
    "presumably"    0.62
    " 자체가"       0.59
    " notoriously"  0.57
    " essentially"  0.57
    " justamente"   0.57
    Activation Density: 0.035%

    No Known Activations