INDEX
    Explanations
    No Explanations Found
    Negative Logits
    Token             Logit
    " ðŁ"             -0.17
    "392"             -0.15
    " emoji"          -0.15
    "abra"            -0.15
    "ðŁ"              -0.14
    " ingr"           -0.14
    "uez"             -0.14
    " industries"     -0.14
    " ðŁij"           -0.13
    " Emoji"          -0.13
    Positive Logits
    Token             Logit
    " Small"          0.18
    " micro"          0.18
    "Individual"      0.17
    ".micro"          0.17
    " Individual"     0.16
    "Small"           0.16
    "idges"           0.15
    "_micro"          0.15
    " network"        0.15
    "/small"          0.15
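
    The page does not say how these logit values were produced. A common convention for dashboards of this kind is to project a feature's output direction through the model's unembedding matrix and list the tokens with the largest and smallest scores. The sketch below illustrates that convention only; `logit_effects`, `top_k`, the direction vector, the matrix, and the numbers are all hypothetical and are not taken from this feature.

```python
def logit_effects(direction, unembed, vocab):
    """Project a feature direction through an unembedding matrix.

    direction: list of d floats (hypothetical feature output direction)
    unembed:   d rows, each with len(vocab) floats (hypothetical matrix)
    vocab:     token strings, one per column of unembed
    """
    scores = [
        sum(direction[i] * unembed[i][j] for i in range(len(direction)))
        for j in range(len(vocab))
    ]
    return list(zip(vocab, scores))


def top_k(effects, k):
    """Return (k most negative, k most positive) token/score pairs."""
    ranked = sorted(effects, key=lambda pair: pair[1])
    negative = ranked[:k]          # lowest scores first
    positive = ranked[::-1][:k]    # highest scores first
    return negative, positive


# Toy data invented for illustration; not the real model weights.
vocab = [" Small", " micro", " emoji", "abra"]
direction = [1.0, -0.5]
unembed = [
    [0.2, 0.1, -0.1, -0.2],
    [0.0, -0.1, 0.1, 0.0],
]

effects = logit_effects(direction, unembed, vocab)
negative, positive = top_k(effects, 2)
for token, score in negative + positive:
    print(f"{token!r:12} {score:+.2f}")
```

    Tokens are printed with `repr` so that a leading space, which distinguishes entries such as " Small" and "Small", stays visible.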
    Activation Density: 0.000%

    No Known Activations

    This feature has no known activations.