    Explanations

    empty quotation marks and syntax-related characters

    Negative Logits
     queſta        -1.24
    niſſe          -1.21
    <pad>          -1.20
    [@BOS@]        -1.20
    <unused68>     -1.20
    iſchen         -1.20
    <unused43>     -1.20
    <unused41>     -1.20
    <unused14>     -1.20
    <unused28>     -1.20
    Positive Logits
    <eos>                0.49
    (token not shown)    0.48
    (token not shown)    0.40
    ↵↵                   0.40
    1                    0.37
    .                    0.35
    2                    0.31
    I                    0.30
    ↵↵↵                  0.30
    (whitespace)         0.30
    Act Density 0.000%

    No Known Activations