INDEX
    Explanations

    verbs related to actions or behaviors

    Negative Logits
    Token            Logit
    "mare"           -0.74
    "lights"         -0.67
    "mares"          -0.64
    "antha"          -0.59
    " Lank"          -0.58
    " Methods"       -0.55
    " Statistics"    -0.54
    ")=("            -0.53
    " Canal"         -0.53
    " Azerb"         -0.53
    Positive Logits
    Token            Logit
    "omsday"         0.97
    "ppel"           0.88
    "le"             0.84
    "pez"            0.82
    "xx"             0.81
    " justice"       0.80
    "lez"            0.79
    " laundry"       0.78
    "omething"       0.77
    "herty"          0.76
    Act Density 0.095%

    No Known Activations