    Explanations

    references to behavior and behavioral changes

    Negative Logits
    Token            Logit
    " Laurens"       -0.68
    "risten"         -0.64
    " Tup"           -0.62
    " blest"         -0.61
    " gzip"          -0.60
    "NotNil"         -0.59
    " Clooney"       -0.59
    " geschlagen"    -0.59
    " kurulan"       -0.59
    " ciąży"         -0.59
    Positive Logits
    Token            Logit
    " behavior"      2.57
    " behaviour"     2.40
    " Behavior"      2.33
    "behavior"       2.25
    " behaviors"     2.20
    " BEHAVIOR"      2.16
    "Behavior"       2.10
    " Behaviour"     2.10
    " behaviours"    2.04
    "behaviour"      2.03
    Activation Density: 0.106%

    No Known Activations