INDEX
    Explanations

    occurrences of the letter "y" in various contexts

    Negative Logits
    <eos>       -0.64
    ↵↵          -0.62
    .           -0.61
                -0.55
    __":        -0.51
     “          -0.49
    ,           -0.49
    ...         -0.49
    DDE         -0.48
     "          -0.47
    Positive Logits
    y           1.97
     y          1.45
    Y           1.32
    𝑦           1.02
    𝐲           0.99
     Monfieur   0.98
    𝙮           0.97
                0.96
     Majefty    0.95
    𝚢           0.95
    Activation Density: 0.207%

    No Known Activations