INDEX
    NEGATIVE LOGITS
    " barber"         -0.08
    "seal"            -0.08
    " operations"     -0.08
    " operações"      -0.07
    "Operations"      -0.07
    "bus"             -0.07
    " 이미지"          -0.07
    " barre"          -0.07
    " imagery"        -0.07
    "onden"           -0.07
    POSITIVE LOGITS
    " desperately"    0.08
    " disastr"        0.08
    " reluctantly"    0.08
                      0.08
    " soprattutto"    0.08
    " Hot"            0.08
    " annoying"       0.08
    " cavity"         0.08
    " سلم"            0.08
    " ایم"            0.07
    Act Density 0.002%

    No Known Activations