    NEGATIVE LOGITS
    Token       Logit
    "'"         1.13
    "t"         0.89
    "p"         0.86
    "aad"       0.85
    "f"         0.84
    " can"      0.83
    " abide"    0.82
    " është"    0.82
    "er"        0.79
    "ença"      0.79

    POSITIVE LOGITS
    Token       Logit
    " Eggs"     1.44
    "🥚"        1.35
    " eggs"     1.34
    " Egg"      1.28
    "eggs"      1.20
    "鸡蛋"      1.09
    " egg"      1.06
    "Eggs"      1.06
    (token text missing)  1.06
    "Egg"       1.02

    Activation Density: 0.029%

    No Known Activations