INDEX
    Explanations

    sentence endings and separators

    Negative Logits
     उदा             0.79
     eg              0.70
    第一個            0.69
     deleting        0.68
    less             0.68
     magic           0.68
     volunteering    0.68
     homework        0.66
     big             0.65
     equality        0.65
    Positive Logits
    etera            1.23
    .),              1.20
    .).              1.17
    .;               1.07
    .],              1.01
    .?               1.00
    .—               0.99
    ."),             0.97
    .');             0.96
    .].              0.95
    Activation Density: 0.114%

    No Known Activations