    NEGATIVE LOGITS
    Token              Logit
     grabbed           -0.06
    され               -0.06
    gif                -0.06
    communication      -0.06
     Arist             -0.06
    ächst              -0.06
     overwhelmed       -0.06
     الولايات          -0.06
     Frames            -0.06
    Germany            -0.06
    POSITIVE LOGITS
    Token              Logit
     kite              0.06
     rainfall          0.06
     натураль          0.06
     عرضه              0.06
     paní              0.06
    (no token text)    0.06
     cube              0.06
    .apply             0.06
     cq                0.06
    enské              0.06
    Activation Density: 0.085%

    No Known Activations