INDEX
    Explanations

    phrases introducing a particular point or argument

    the word "So" used to introduce explanations or conclusions

    Negative Logits
    saf       -0.66
    thro      -0.63
     ``(      -0.61
     nic      -0.59
    ¢         -0.58
     Halls    -0.56
    âĹ¼       -0.55
     Purg     -0.55
     [[       -0.54
    ski       -0.54
    Positive Logits
    oner      1.25
    bered     1.01
    fter      0.99
    FTWARE    0.98
    apy       0.95
    oths      0.91
    othes     0.90
    ooo       0.90
    aps       0.86
    aked      0.84
    Act Density 0.061%

    No Known Activations