    NEGATIVE LOGITS
    Token     Logit
    On        -3.42
    {         -3.39
    After     -3.27
    When      -3.19
    Our       -3.14
    While     -3.11
    The       -2.97
    During    -2.94
    What      -2.92
    There     -2.91
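    POSITIVE LOGITS
    Token            Logit
    (unrendered)     2.61
    (unrendered)     2.38
    ophageal         2.38
    (unrendered)     2.31
    ſſi              2.31
    (unrendered)     2.31
    (unrendered)     2.27
    (unrendered)     2.25
    (unrendered)     2.25
    (unrendered)     2.23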
    Activation Density: 0.001%

    No Known Activations