INDEX
    Explanations

    names of people and locations, especially related to historical events

    Negative Logits
     somet      -0.22
     Swordsman  -0.21
    pole        -0.21
    ulhu        -0.20
     isEnabled  -0.20
    inosaur     -0.20
     scales     -0.20
     epile      -0.19
    ensor       -0.19
    ritic       -0.19
    Positive Logits
    lain        0.23
    hua         0.23
    gar         0.22
    ira         0.22
    aii         0.22
    lla         0.22
    edu         0.22
     Blanc      0.21
    gars        0.21
    ously       0.21
    Activation Density: 14.666%

    No Known Activations