INDEX
    Explanations

    mentions of specific names or proper nouns

    names and references to specific individuals or entities

    Negative Logits
    rum         -0.84
    mented      -0.81
    matically   -0.80
     glim       -0.77
     shenan     -0.73
    rers        -0.71
    ging        -0.71
    fully       -0.71
     pse        -0.70
    ged         -0.70
    Positive Logits
    tera        0.92
    terday      0.73
    alon        0.72
    xual        0.72
     Centauri   0.71
    bian        0.70
    oulos       0.68
    odus        0.67
    ylon        0.67
    uclear      0.65
    Activation Density: 0.028%

    No Known Activations