INDEX
    Explanations
    No Explanations Found
    Negative Logits
    Token            Logit
    "foundland"      -0.69
    " accuser"       -0.68
    " elbows"        -0.67
    " Maz"           -0.64
    "!--"            -0.62
    "ORN"            -0.60
    " smoot"         -0.60
    "0100"           -0.60
    " linguistic"    -0.60
    " opaque"        -0.60
    Positive Logits
    Token            Logit
    "enegger"        1.01
    "hog"            0.89
    "sworth"         0.80
    " <["            0.75
    "icho"           0.72
    "hiba"           0.71
    "phies"          0.70
    "cong"           0.69
    "isy"            0.68
    "itz"            0.68
    Act Density 0.000%

    No Known Activations

    This feature has no known activations.