    NEGATIVE LOGITS
    Token                 Logit
    "ATTR"                -0.07
    " toxins"             -0.07
    "67"                  -0.07
    "BILL"                -0.06
    " SOUND"              -0.06
    "::::::::::::::::"    -0.06
    "attr"                -0.06
    " Bombay"             -0.06
    "Higher"              -0.06
    " fungi"              -0.06

    POSITIVE LOGITS
    Token                 Logit
    " lap"                0.16
    " Lap"                0.12
    " laps"               0.10
    "lap"                 0.09
    " nap"                0.09
    "อป"                  0.08
    "nap"                 0.08
    "lak"                 0.08
    "лас"                 0.08
    " Iowa"               0.08

    Activation Density: 0.004%

    No Known Activations