INDEX
    Explanations
    Negative Logits
    Token          Logit
    "uran"         -0.07
    "(IB"          -0.06
    " увели"       -0.06
    " जनवर"        -0.06
    " emotion"     -0.06
    "어서"         -0.06
    " возможно"    -0.06
    " chương"      -0.06
    "звичай"       -0.06
    "ro"           -0.06
    Positive Logits
    Token          Logit
    " spans"       0.07
    " clim"        0.07
    " struck"      0.07
    "ophon"        0.06
    " firm"        0.06
    " aşağı"       0.06
    " Pickup"      0.06
    " harmed"      0.06
    " Sense"       0.06
    " scene"       0.06
    Activation Density: 0.000%

    No Known Activations