INDEX
    Explanations

    positive adjectives

    Negative Logits
    Token                   Logit
    " जन्म"                 -0.08
    " Major"                -0.08
    " vindt"                -0.08
    "Largest"               -0.08
    " Largest"              -0.07
    (token text missing)    -0.07
    " जाग"                  -0.07
    " Pricing"              -0.07
    " विचार"                -0.07
    " finds"                -0.07
    Positive Logits
    Token       Logit
    " sva"      0.09
    "CTR"       0.08
    "-green"    0.08
    " grub"     0.08
    " FPGA"     0.08
    "CTRL"      0.08
    "Fox"       0.08
    " belli"    0.08
    "ATAR"      0.08
    "auts"      0.08
    Activation Density: 0.052%

    No Known Activations