INDEX
    Explanations

    references to political figures and their affiliations

    Negative Logits
     itſelf        -1.18
     myſelf        -1.10
    )"),           -1.05
     Jefus         -0.99
     pleaſure      -0.98
     nakalista     -0.95
     ſind          -0.95
     ―――――         -0.95
     ་་            -0.94
     whoſe         -0.94
    Positive Logits
     or             1.03
     et             0.64
     if             0.63
     without        0.63
     would          0.61
     for            0.61
     just           0.60
     might          0.59
    ?               0.58
     even           0.57
    Act Density 0.278%

    No Known Activations