INDEX
    Explanations

    phrases related to personal integrity and relationships

    after "who," "what," "it," or "the."

    Negative Logits
    Token         Logit
    " rid"        -0.73
    " ten"        -0.73
    " hom"        -0.73
    " un"         -0.70
    " now"        -0.69
    " fun"        -0.67
    " do"         -0.67
    " per"        -0.67
    " minimal"    -0.67
    " me"         -0.66
    Positive Logits
    Token            Logit
    "selves"         0.73
    " And"           0.68
    " pregunto"      0.65
    " tasche"        0.65
    " alemania"      0.65
    " tiroirs"       0.65
    " cœurs"         0.63
    " stratég"       0.63
    " regeringen"    0.62
    " anún"          0.62
    Activation Density 0.212%

    No Known Activations