INDEX
    Explanations

    double 'o's, or alternatively the word "museum" or related terms

    words or utterances related to excitement or enjoyment

    Negative Logits
    代              -0.85
     misunder      -0.81
    ewski          -0.71
     Luthor        -0.71
    ÑĮ             -0.67
    DonaldTrump    -0.67
    imir           -0.66
    nikov          -0.65
     Integrity     -0.63
    itates         -0.62
    Positive Logits
    zee     1.00
    gee     0.98
    gey     0.98
    zing    0.97
    zeb     0.96
    zers    0.95
    lean    0.94
    ze      0.94
    zer     0.94
    ey      0.93
    Act Density 0.033%

    No Known Activations