INDEX
    Explanations

    adjectives or phrases related to correctness or appropriateness

    instances of the word "appropriate" and its context in relation to behavior or actions

    Negative Logits
    "chet"      -0.87
    "plane"     -0.79
    "planes"    -0.78
    "glass"     -0.78
    "urger"     -0.77
    "cher"      -0.76
    "ker"       -0.75
    "cipl"      -0.74
    "stead"     -0.73
    "peak"      -0.73
    Positive Logits
    "tarian"            0.85
    " punishment"       0.84
    " Dragonbound"      0.83
    " circumstances"    0.82
    " punishments"      0.80
    " amounts"          0.80
    " sized"            0.78
    " responses"        0.78
    "appropriate"       0.76
    " attire"           0.74
    Act Density 0.030%

    No Known Activations