INDEX
    Explanations
    Negative Logits
     Animal     -0.07
    Dark        -0.07
    лых         -0.07
    /all        -0.07
    inema       -0.06
     animals    -0.06
     Asia       -0.06
     animal     -0.06
    _MULT       -0.06
    ERC         -0.06
    Positive Logits
     champagne  0.14
     Champagne  0.11
     tink       0.07
     ribbon     0.07
    oolStrip    0.07
    919         0.06
                0.06
    -toggle     0.06
     edip       0.06
     Merr       0.06
    Activation Density: 0.002%

    No Known Activations