INDEX
    Explanations

    names of characters or individuals

    mentions of specific character names from popular media

    Negative Logits
    Token       Logit
    ocrats      -0.78
    utory       -0.77
    -+-+        -0.74
    anooga      -0.73
    ENTION      -0.70
    ocracy      -0.69
    龍契士         -0.69
    urated      -0.68
     Dominion   -0.68
    otaur       -0.66
    Positive Logits
    Token       Logit
    kj          1.20
     Myster     1.01
    ldom        0.85
     Rey        0.84
    ulic        0.84
    senal       0.79
    zin         0.74
    uling       0.72
    rio         0.70
    issance     0.69
    Activation Density: 0.009%

    No Known Activations