INDEX
    NEGATIVE LOGITS
    These                0.48
    config               0.48
    This                 0.46
    Anyone               0.46
    It                   0.46
    Oct                  0.44
    新たな               0.43
    (blank in source)    0.42
    Performance          0.42
    Cont                 0.42
    POSITIVE LOGITS
     timeCounter         0.55
     nachos              0.52
     होई                 0.49
     메뉴                0.48
     MENU                0.47
     pizzas              0.46
     efter               0.46
     MenuView            0.46
     maken               0.46
     waffles             0.46
    Activation Density 0.004%

    No Known Activations