INDEX
    Explanations
    Negative Logits
    Token             Logit
    "occ"             -0.08
    " Challenger"     -0.08
    " простран"       -0.08
    " gør"            -0.07
    "ussa"            -0.07
    "ulsion"          -0.07
    " pathogens"      -0.07
    " जाते"           -0.07
    "ichtigkeit"      -0.07
    " microbes"       -0.07
    Positive Logits
    Token             Logit
    " Choices"        0.09
    " роман"          0.09
    "(Scene"          0.09
    " Vid"            0.08
    " escenas"        0.08
    " Nus"            0.08
    " Kar"            0.08
    " sezon"          0.08
    " beig"           0.08
    " romance"        0.08
    Activation Density: 0.008%

    No Known Activations