INDEX
    Explanations

    a mix of introductory phrases and expressions of time or conditions

    Negative Logits
    .            -0.58
    ;            -0.52
                 -0.51
    fsp          -0.47
    <bos>        -0.47
    ?            -0.46
    LookAnd      -0.45
     stanno      -0.45
    :            -0.45
    ::::::::     -0.42
    Positive Logits
     itſelf       0.93
     {},          0.84
     fometimes    0.83
     doubtnut     0.82
     ſmall        0.80
     poffible     0.79
     ſeveral      0.79
     myſelf       0.78
     [],          0.77
     $_"          0.77
    Act Density 0.308%

    No Known Activations