    Explanations
    No Explanations Found
    Negative Logits
    Token          Logit
    " etti"        -0.84
    (not shown)    -0.83
    " so"          -0.83
    " as"          -0.82
    " instead"     -0.81
    " als"         -0.80
    " in"          -0.80
    (not shown)    -0.80
    (not shown)    -0.79
    " even"        -0.78
    Positive Logits
    Token          Logit
    <bos>          10.82
    " encomp"       3.77
    " intersper"    3.65
    " increa"       3.59
    " affor"        3.57
    " guarante"     3.56
    " fuf"          3.51
    " maneu"        3.49
    " perfet"       3.42
    " uninten"      3.42
    Activation Density: 0.000%

    No Known Activations
