INDEX
    Explanations

    phrases related to habitual tendencies or behaviors

    phrases that describe tendencies or behavioral patterns

    Negative Logits
    gur     -0.76
    yz      -0.74
    arta    -0.71
    lain    -0.69
    ZA      -0.64
    ania    -0.62
    zbek    -0.60
    fil     -0.60
    ft      -0.59
    KY      -0.59
    Positive Logits
    rils       1.36
    entious    1.18
    ril        1.06
    erer       0.88
    erers      0.86
    erest      0.84
     toward    0.84
     towards   0.80
    entimes    0.80
    eman       0.79
    Activation Density: 0.018%

    No Known Activations