INDEX
    Explanations

    names or terms related to specific people or locations

    proper nouns related to names and places

    Negative Logits
    Token        Logit
    cerning      -0.60
    ocating      -0.57
    ulative      -0.56
    ancies       -0.56
    lycer        -0.55
    ifty         -0.54
    atures       -0.54
    anchester    -0.53
    arine        -0.53
    rius         -0.52

    Positive Logits
    Token        Logit
    shit         0.56
    bash         0.54
    vals         0.52
     Lines       0.52
    ¥µ           0.51
    zon          0.50
    icho         0.50
    Redditor     0.50
    scl          0.50
    strap        0.50

    Activation Density: 0.990%

    No Known Activations