    Explanations

    significant nouns and verbs indicating ongoing actions or events

    Negative Logits
    ione
    -0.17
    intree
    -0.17
    ồ
    -0.15
    .writeString
    -0.15
    ION
    -0.15
    министра
    -0.15
    cih
    -0.15
    946
    -0.14
    alf
    -0.14
    usch
    -0.14
    Positive Logits
    bjerg
    0.19
    quam
    0.17
    acket
    0.15
     Heller
    0.14
    ktor
    0.14
    ใบ
    0.14
     Jay
    0.14
    phies
    0.14
    heet
    0.14
    .fix
    0.14
    Activation Density: 0.001%

    No Known Activations