INDEX
    Explanations

    phrases that start with "non" or are related to non-classical or non-invasive topics

    Negative Logits
    DockStyle
    -0.91
     Vikipedi
    -0.82
     rumahnya
    -0.82
     italienne
    -0.81
     Theſe
    -0.81
     itſelf
    -0.80
     ainfi
    -0.80
    myModal
    -0.80
     Beſ
    -0.79
    发表于
    -0.79
    Positive Logits
     non
    2.00
     Non
    1.91
    Non
    1.89
    non
    1.81
     NON
    1.81
    NON
    1.60
    1.53
     nons
    1.26
     nong
    1.24
     Nons
    1.23
    Act Density 0.095%

    No Known Activations