INDEX
    Explanations

    words related to behavior or actions

    variations of the word "behave" in different contexts

    Negative Logits
    Token         Logit
    "fram"        -0.77
    "andel"       -0.71
    "lake"        -0.68
    "pelling"     -0.67
    " Solo"       -0.66
    "fer"         -0.65
    " landing"    -0.65
    "ondo"        -0.64
    " Herz"       -0.64
    "export"      -0.63
    Positive Logits
    Token             Logit
    "uate"            1.01
    "iments"          0.88
    " err"            0.86
    " behavi"         0.85
    " behaves"        0.85
    "uated"           0.84
    "uations"         0.84
    " differently"    0.82
    "ativity"         0.81
    " behave"         0.81
    Act Density 0.029%

    No Known Activations