INDEX
    Explanations

    phrases that suggest denial or deflection regarding accountability

    Negative Logits
     typelib       -0.64
    '}),           -0.59
    __':           -0.58
    /');           -0.58
    ̍t             -0.56
    )');           -0.56
    Obviously      -0.53
    ,:);           -0.52
    %"),           -0.52
    "}";           -0.52
    Positive Logits
    titleMargin          0.68
     étoit               0.67
     hendes              0.63
    Fordítás             0.62
     resourceCulture     0.61
    erkek                0.57
    ọng                  0.57
     femininas           0.57
     elä                 0.57
    paravant             0.54
    Activation Density: 0.103%

    No Known Activations