    Explanations

    detailed or proper instructions

    Negative Logits
    Token             Logit
    " vanwege"        0.46
    " wodurch"        0.45
    " thisobject"     0.42
    " apprehensive"   0.42
    " personenbez"    0.41
    "ുമായി"            0.41
    " volition"       0.41
    " monologue"      0.40
    " duomen"         0.40
    "ponenten"        0.40
    Positive Logits
    Token             Logit
    " সঠিকভাবে"        0.61
    "ちゃんと"         0.60
    "每个"             0.57
    " 제대로"          0.56
    " ভালোভাবে"        0.55
    "及时"             0.54
    " لكل"            0.54
    " truly"          0.54
    "真实的"           0.54
    "きちんと"         0.54
    Act Density 0.060%

    No Known Activations