INDEX
    Explanations

    references to individuals or groups

    Negative Logits
    "idor"          -0.18
    "bilt"          -0.16
    "atik"          -0.15
    "éϵ"            -0.15
    "åĬ±"           -0.14
    "?action"       -0.14
    "baz"           -0.14
    "undi"          -0.14
    ".getID"        -0.14
    "оÑĤоÑĢ"        -0.14
    Positive Logits
    " who"          0.23
    " involved"     0.19
    "who"           0.18
    " whom"         0.17
    " responsible"  0.17
    " helm"         0.17
    "het"           0.16
    " Who"          0.15
    " joining"      0.14
    "elper"         0.14
    Activation Density: 0.276%

    No Known Activations