    Negative Logits
     equil            -0.08
    Phy               -0.08
    Ribbon            -0.08
    евар              -0.07
    Indeed            -0.07
    Board             -0.07
    kové              -0.07
     தெரிய            -0.07
    ёл                -0.07
    Unlock            -0.07
    Positive Logits
     gradually        0.09
     progressively    0.08
     ques             0.08
     vague            0.08
     tom              0.08
     ambiguous        0.07
    bbbb              0.07
    604               0.07
    Ì                 0.07
     pace             0.07
    Activation Density: 0.012%

    No Known Activations