    Negative Logits
        "/mark"        -0.07
        ".activities"  -0.07
        " زیب"         -0.07
        "̃"            -0.07
        "perience"     -0.06
        "boot"         -0.06
        "[tag"         -0.06
        " def"         -0.06
                       -0.06
        "modify"       -0.06
    Positive Logits
        " Eat"         0.11
        " eat"         0.10
        " eating"      0.09
        "Eat"          0.07
        " eats"        0.07
        " eaten"       0.07
        " Eating"      0.07
        " Eaton"       0.07
        " ecological"  0.06
        " Read"        0.06
    Act Density: 0.018%

    No Known Activations