INDEX
    Explanations

    abstract nouns followed by punctuation

    Negative Logits
     hoặc    0.51
    と思いますが    0.48
     ຫຼື    0.48
    かもしれませんが    0.45
    /    0.45
     (    0.45
     ή    0.45
     veya    0.45
    或其他    0.43
    했고    0.42
    Positive Logits
    0.77
    .”
    0.64
    。”
    0.63
    ."
    0.61
    .</
    0.60
    .\\
    0.57
    。」
    0.54
    .")
    0.51
    .~
    0.49
    .​
    0.49
    Act Density 0.207%

    No Known Activations