INDEX
    Explanations

    proper nouns related to individuals, particularly names

    references to specific individuals and brands

    Negative Logits
    Logit   Token
    -0.82   "ences"
    -0.81   "ential"
    -0.68   "enced"
    -0.67   "--------------------------------------------------------"
    -0.64   "LINE"
    -0.63   "tein"
    -0.63   "swick"
    -0.63   "body"
    -0.63   "PATH"
    -0.62   "ilee"
    Positive Logits
    Logit   Token
    1.26    " Mats"
    1.09    "ushima"
    1.00    "ura"
    0.98    " mats"
    0.79    "wana"
    0.78    " misunder"
    0.77    "awan"
    0.76    "uchin"
    0.75    "aido"
    0.73    " Kats"
    Act Density 0.009%

    No Known Activations