    Explanations

    personal pronouns indicating actions or emotions directed towards oneself or others

    Negative Logits
    Token               Logit
    " Associated"       -0.68
    "quartered"         -0.66
    " Rousse"           -0.66
    " Electrical"       -0.63
    " Globe"            -0.62
    "laughter"          -0.61
    " Canaver"          -0.61
    " understatement"   -0.61
    " Stra"             -0.61
    "holding"           -0.60

    Positive Logits
    Token               Logit
    "'ve"               1.10
    " reached"          0.96
    " arrived"          0.94
    " got"              0.92
    " realise"          0.92
    " finally"          0.92
    " realize"          0.91
    "'re"               0.90
    " realized"         0.90
    " reaches"          0.90

    Activation Density: 0.145%

    No Known Activations