    Negative Logits
    Token     Logit
    s         -0.21
    umi       -0.17
    Ùĩ        -0.16
    aldi      -0.15
    (éĩij     -0.15
    vore      -0.14
    "sync     -0.14
    boa       -0.14
    nia       -0.14
    ned       -0.14
    Positive Logits
    Token     Logit
    grav      0.15
    ivate     0.15
    ìķķ       0.14
    okud      0.14
    edom      0.14
    macros    0.14
    eming     0.13
    anie      0.13
    opak      0.13
    окÑĥ      0.13
    Activation Density: 0.008%

    No Known Activations