    Explanations

    phrases related to social behavior and interactions among people

    Head Attr Weights
    0:0.02
    1:0.02
    2:0.21
    3:0.27
    4:0.05
    5:0.04
    6:0.06
    7:0.05
    8:0.04
    9:0.07
    10:0.08
    11:0.04
    Negative Logits
     backing    -1.46
    berus       -1.41
     Schne      -1.34
     Bened      -1.33
    rium        -1.33
    bered       -1.23
    atever      -1.20
    pora        -1.19
     formation  -1.19
     Marino     -1.19
    Positive Logits
    "]=>        2.53
     ·          1.83
     sqor       1.80
    embed       1.74
    avid        1.68
    aden        1.66
    ヘラ          1.62
     david      1.60
    ディ          1.55
     Posted     1.51
    Act Density 0.025%

    No Known Activations