INDEX
    Explanations

    references to agitation or frustration

    Negative Logits
     behavi    -0.75
    untu       -0.73
    ¥ŀ         -0.71
    ciating    -0.69
    quished    -0.69
    terior     -0.68
    zac        -0.67
    agonist    -0.67
    sterdam    -0.66
    issance    -0.64
    Positive Logits
    fork       1.05
    imaru      1.01
    icago      0.82
    IELD       0.81
    y          0.80
    cock       0.77
    itch       0.77
     Weaver    0.72
     Cobb      0.71
    acus       0.68
    Activation Density: 0.028%

    No Known Activations