INDEX
    Explanations

    end of sentence questions

    Negative Logits
    Token             Value
    " aquelas"        0.45
    " aquela"         0.44
    " aquel"          0.43
    " aqueles"        0.43
    " aquellas"       0.42
    " aquellos"       0.41
                      0.41
    "那样"            0.40
    " disclosures"    0.40
    " aquele"         0.39

    Positive Logits
    Token             Value
    "怎么办"          0.73
    "HELP"            0.72
    " HELP"           0.71
    " Help"           0.69
    "help"            0.64
    " help"           0.63
    " Tried"          0.62
    " Worse"          0.59
    "tried"           0.59
    " tried"          0.59
    Act Density 0.009%

    No Known Activations