    Explanations

    phrases related to societal issues and personal experiences

    Negative Logits
     overlook    -1.02
     roundup     -0.94
     inflic      -0.92
     incompet    -0.89
     drown       -0.88
     arch        -0.88
     qualified   -0.86
     shelling    -0.85
     adjud       -0.85
     aven        -0.84
    Positive Logits
    They          1.46
    We            1.46
    Our           1.36
    Everything    1.34
    It            1.34
    Sometimes     1.34
    Too           1.28
    There         1.28
    When          1.27
    What          1.27
    Activation Density: 0.535%

    No Known Activations