    NEGATIVE LOGITS
        "s"            0.86
        "ion"          0.80
        "nd"           0.73
        "maßen"        0.73
        " gef"         0.70
        "lig"          0.66
        " инструмент"  0.66
        " различни"    0.65
        "law"          0.64
        " метал"       0.63
    POSITIVE LOGITS
        "د"            0.74
        " as"          0.73
        " vanilla"     0.71
        "е"            0.71
        "EN"           0.70
        " scrut"       0.68
        "یه"           0.66
        "as"           0.65
        "‌تر"           0.65
        "ه‌"            0.64
    Activation Density: 0.004%

    No Known Activations