INDEX
    Explanations

    Korean and Japanese conversational fillers

    Negative Logits
        "旨在"  0.91
        "および"  0.77
        " terdapat"  0.76
        " thereby"  0.74
        " exertion"  0.74
        "؛"  0.74
        "并在"  0.73
        "함으로써"  0.73
        " ως"  0.72
        " вследствие"  0.72
    Positive Logits
        " 너무"  1.32
        " 정말"  1.30
        " 요즘"  1.27
        " 많이"  1.16
        " 저는"  1.14
        " avevo"  1.13
        "私は"  1.13
        "じゃない"  1.12
        "だけど"  1.10
        "너무"  1.10
    Activation Density: 0.001%

    No Known Activations