    NEGATIVE LOGITS
    " cairan"       0.47
    " Faith"        0.46
    "Faith"         0.45
    " faith"        0.43
    " substring"    0.42
    " ovarian"      0.42
    "Floral"        0.42
    " subtree"      0.40
    " Stir"         0.40
    " Iterable"     0.39
    POSITIVE LOGITS
    " burger"       1.38
    " burgers"      1.33
    "🍔"            1.26
    " Burger"       1.23
    "Burger"        1.20
    " hamburger"    1.17
    " patty"        1.08
    "burger"        1.07
    " Burgers"      1.07
    " hamburgers"   1.05
    Activation Density: 0.034%

    No Known Activations