INDEX
    Explanations

    alternative actions or choices

    the phrase "instead" indicating alternative actions or perspectives

    Negative Logits
    Token               Logit
    "vez"               -0.67
    " neighbourhood"    -0.66
    "aph"               -0.65
    " foundations"      -0.65
    "Que"               -0.60
    "dies"              -0.60
    "dirty"             -0.60
    " derby"            -0.59
    "ties"              -0.59
    " foundation"       -0.58
    Positive Logits
    Token               Logit
    "ctr"               0.74
    "zbek"              0.71
    "ortun"             0.69
    "replace"           0.69
    "chart"             0.69
    "heses"             0.68
    " opting"           0.66
    "terness"           0.64
    "ertodd"            0.63
    "hesis"             0.63
    Activation Density: 0.022%

    No Known Activations