INDEX
    Explanations

    dialogue or asking questions

    Negative Logits
                        -1.27
     souvent            -1.17
     quelquefois        -1.14
    Often               -1.05
     fidé               -1.05
    点了点头            -1.04
     Rarely             -1.02
     rigue              -1.02
    totta               -1.02
    往往                -1.02
    Positive Logits
     ask                1.88
     asks               1.80
     tells              1.68
     tell               1.67
     explain            1.66
     wonder             1.54
     suggest            1.45
     mention            1.42
     remark             1.38
     assure             1.36
    Activation Density: 0.019%

    No Known Activations