INDEX
    Explanations
    Negative Logits
    Token           Logit
    158             -0.06
     believable     -0.06
    responses       -0.06
     differentiate  -0.06
     functionName   -0.06
     shredded       -0.06
     其他             -0.06
     Injectable     -0.06
     суп            -0.06
    ozřejmě         -0.06
    Positive Logits
    Token           Logit
    lw              0.08
     camp           0.07
     Safari         0.07
     Mandal         0.07
    .JTable         0.06
     tents          0.06
    (EFFECT         0.06
     picnic         0.06
     arrived        0.06
    μβ              0.06
    Activation Density: 0.017%

    No Known Activations