INDEX
    Explanations

    terms associated with mild conditions or effects

    Negative Logits
    ^(@)            -1.30
     purpoſe        -1.29
     ſtate          -1.28
     itſelf         -1.28
    Personendaten   -1.27
     houſe          -1.26
    ſelves          -1.21
     myſelf         -1.20
     whoſe          -1.18
    ſelf            -1.18
    Positive Logits
    *(              0.96
                    0.79
    *               0.67
    ↵↵              0.65
    '               0.65
    **              0.64
    frac            0.63
     =              0.59
    .               0.58
     and            0.58
    Act Density 0.624%

    No Known Activations