INDEX
    Explanations

    I've included / I put together

    Negative Logits
    Token            Logit
    "presumably"     0.40
    " phải"          0.36
    "previously"     0.36
    " hitherto"      0.35
    " Harus"         0.34
    "Previously"     0.34
    " heretofore"    0.33
    " bukanlah"      0.33
    " harus"         0.33
    "する必要"        0.33
    Positive Logits
    Token            Logit
    " gave"          0.66
    " wrote"         0.60
    " took"          0.56
    "wrote"          0.55
    " included"      0.54
    " telah"         0.54
    " suggested"     0.53
    " Included"      0.53
    " đã"            0.50
    " created"       0.50
    Activation Density: 0.001%

    No Known Activations