    Explanations

    phrases indicating classification or categorization

    Negative Logits
     surla          -0.45
    তথ্যসূত্র       -0.40
     murale         -0.39
     domés          -0.39
    Manbalar        -0.38
     aveug          -0.38
                    -0.34
    Ness            -0.34
    Cosmetic        -0.34
    AnchorStyles    -0.34
    Positive Logits
     sorta          0.80
     Kinda          0.77
     somewhat       0.76
     kinda          0.72
     Somewhat       0.69
    Somewhat        0.68
    Kinda           0.68
    styleable       0.66
    kinda           0.64
    somewhat        0.63
    Activation Density: 0.176%

    No Known Activations