INDEX
    Explanations

    pronouns indicating personal relationships or interactions

    Head Attr Weights
    0:0.07
    1:0.04
    2:0.10
    3:0.08
    4:0.07
    5:0.13
    6:0.04
    7:0.05
    8:0.20
    9:0.06
    10:0.05
    11:0.05
    Negative Logits
    " ancest"      -1.75
    "alach"        -1.75
    " challeng"    -1.71
    "pta"          -1.61
    "atown"        -1.60
    "aine"         -1.56
    "hedon"        -1.53
    "odan"         -1.52
    "agine"        -1.51
    "puter"        -1.51
    Positive Logits
    "invoke"           1.53
    "ía"               1.50
    "aeus"             1.47
    "ERY"              1.45
    " disconnected"    1.43
    "ICES"             1.43
    " CentOS"          1.42
    "REC"              1.42
    "toggle"           1.41
    "VICE"             1.41
    Act Density 0.000%

    No Known Activations