    Negative Logits
     actually    0.55
    まさに    0.54
     eigent    0.52
     basically    0.52
     Dl    0.50
     simplesmente    0.50
    <h6>    0.49
    Basically    0.49
    И    0.48
    Л    0.48
    Positive Logits
    至少    0.71
    পক্ষে    0.59
    少なくとも    0.56
     almeno    0.55
     zumindest    0.54
     atleast    0.54
    ES    0.52
    ED    0.52
     অন্তত    0.51
    0.51
    Act Density 0.011%

    No Known Activations