INDEX
    Explanations

    punctuation marks and symbols indicating emphasis or emotion

    Negative Logits
    </h6>        -0.68
    ↵↵↵↵↵↵       -0.62
    ↵↵↵↵↵        -0.62
    ↵↵↵↵↵↵↵      -0.59
    ↵↵↵          -0.57
    ↵↵↵↵↵↵↵↵     -0.57
    ↵↵↵↵         -0.57
    expandindo   -0.56
    ↵↵↵↵↵↵↵↵↵    -0.54
    ↵↵           -0.54
    Positive Logits
    ."           1.40
    。”           1.36
    .”           1.36
    ".           1.35
    ").          1.32
    ”.           1.31
                 1.30
    ”。           1.30
    )."          1.25
    "            1.24
    Activation Density: 0.169%

    No Known Activations