INDEX
    Explanations

    words related to making decisions or assessments

    Negative Logits
    Token          Logit
    "bery"         -0.17
    "ilities"      -0.16
    "uary"         -0.16
    "oms"          -0.15
    "ses"          -0.15
    "orman"        -0.15
    "iler"         -0.14
    "ove"          -0.14
    "iev"          -0.14
    "ilitating"    -0.14
    Positive Logits
    Token          Logit
    "316"          0.15
    "ants"         0.15
    " loose"       0.15
    " whether"     0.15
    "ffen"         0.15
    "expr"         0.15
    "lich"         0.15
    "antt"         0.14
    "esub"         0.14
    "angling"      0.14
    Activation Density: 0.022%

    No Known Activations