    Explanations

    concepts related to belonging and membership within communities or groups

    NEGATIVE LOGITS
    Token           Logit
    "raf"           -0.17
    "buz"           -0.14
    " Lifestyle"    -0.14
    "omu"           -0.14
    " arrang"       -0.14
    "ofs"           -0.14
    "riba"          -0.13
    "rico"          -0.13
    "uky"           -0.13
    "ào"            -0.13
    POSITIVE LOGITS
    Token           Logit
    " belong"       0.66
    " belongs"      0.66
    " belonged"     0.63
    "belongs"       0.58
    "bel"           0.56
    " Bel"          0.55
    " pert"         0.54
    " належ"        0.51
    " gehört"       0.48
    "属于"          0.48
    Activation Density: 0.126%

    No Known Activations