INDEX
    Explanations

    phrases indicating responsibility and accountability for actions or events

    Negative Logits
    Token              Logit
    "RenderAtEndOf"    -0.53
    " imageNamed"      -0.52
    "moke"             -0.50
    " TestBed"         -0.48
    "iento"            -0.48
    "komme"            -0.47
    "ulkner"           -0.47
    "degenerate"       -0.46
    "Subjects"         -0.46
    " الموا"           -0.45
    Positive Logits
    Token              Logit
    " role"            0.86
    " contribution"    0.83
    " Contribution"    0.80
    " roles"           0.77
    "Contribution"     0.76
    " contributions"   0.76
    "role"             0.73
    " duties"          0.72
    "贡献"             0.72
    "貢献"             0.71
    Activation Density: 0.569%

    No Known Activations