INDEX
    Explanations

    names, particularly in constructions that contain "Turn"

    Negative Logits
    è¦ļéĨĴ
    -0.87
    mma
    -0.70
    ropolitan
    -0.69
    etitive
    -0.66
    riad
    -0.65
     Mellon
    -0.64
    lain
    -0.63
    capacity
    -0.63
    ording
    -0.63
     Annotations
    -0.62
    Positive Logits
    coat
    0.80
    Ī
    0.74
    buck
    0.73
    ĸ
    0.72
     into
    0.70
    ¸
    0.70
     crank
    0.70
     sour
    0.70
     inward
    0.70
     beet
    0.70
    Act Density 3.586%

    No Known Activations