    Explanations

    names of people or places

    words and phrases that denote entities, particularly names and locations

    Negative Logits
    Token         Logit
    " Caps"       -0.93
    " Catalyst"   -0.90
    " Bers"       -0.88
    " Baird"      -0.88
    " Das"        -0.87
    " Amp"        -0.86
    " Bash"       -0.85
    " Pug"        -0.84
    "Cap"         -0.82
    " ASP"        -0.81
    Positive Logits
    Token         Logit
    "loe"         0.96
    "walker"      0.87
    " lo"         0.83
    "Ho"          0.82
    "iel"         0.79
    "que"         0.79
    "hello"       0.78
    " walking"    0.77
    " walked"     0.76
    "HO"          0.76
    Activation Density: 0.373%

    No Known Activations