INDEX
    Explanations

    references to personal connections and community engagement

    Negative Logits
    Token        Logit
    "ivor"       -0.18
    " Insider"   -0.16
    "aks"        -0.16
    "jal"        -0.15
    "-ahead"     -0.15
    "ziel"       -0.14
    "dar"        -0.14
    "okit"       -0.14
    "ritch"      -0.14
    "pri"        -0.14
    Positive Logits
    Token        Logit
    " into"      0.28
    " closer"    0.28
    " together"  0.24
    " alive"     0.23
    " Clo"       0.21
    "Into"       0.21
    "into"       0.21
    "alive"      0.21
    " Into"      0.21
    " back"      0.21
    Act Density 0.032%

    No Known Activations