INDEX
    Explanations

    mentions of the word "te" with a high level of activation

    instances of the word "te."

    Negative Logits
    Token            Logit
    " eleph"         -0.71
    "ipolar"         -0.65
    "interrupted"    -0.65
    " justified"     -0.63
    " continents"    -0.62
    "lessly"         -0.61
    "etheless"       -0.61
    "ĸļ"             -0.61
    " pressures"     -0.61
    " fronts"        -0.60
    Positive Logits
    Token            Logit
    "brate"          1.42
    "brates"         1.30
    "ller"           1.20
    "achers"         1.12
    "levision"       1.10
    "llers"          1.08
    "achable"        1.06
    "legram"         1.04
    "legraph"        1.03
    "aching"         1.02
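
The positive-logit tokens above read most naturally as continuations of the token "te" named in the explanations. The sketch below simply prepends "te" to each listed token to make that visible. It is only an illustration: the helper name `complete` is hypothetical, and the assumption that the feature fires on "te" and promotes these continuations comes from reading the lists together, not from any stated method. Note that "brate" and "brates" do not form words on their own with "te"; they would fit a longer word such as "vertebrate", which is a guess.

```python
# Tokens copied from the positive-logit list above (no leading spaces).
POSITIVE_LOGIT_TOKENS = [
    "brate", "brates", "ller", "achers", "levision",
    "llers", "achable", "legram", "legraph", "aching",
]


def complete(prefix, tokens):
    """Hypothetical helper: join a prefix token with each candidate continuation."""
    return [prefix + token for token in tokens]


if __name__ == "__main__":
    # Assumption: the feature activates on the token "te".
    for word in complete("te", POSITIVE_LOGIT_TOKENS):
        print(word)
```

Most of the joined strings are ordinary English words ("teller", "teachers", "television", "telegram"), which is consistent with the two explanations describing the feature as responding to "te".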
    Activation Density: 0.017%

    No Known Activations