INDEX
    Explanations

    negations and terms related to absence or prohibition

    Negative Logits
    Token        Logit
    "ogh"        -0.19
    " sais"      -0.15
    "agh"        -0.15
    "romium"     -0.15
    "ToWorld"    -0.15
    "yre"        -0.14
    " Swords"    -0.14
    "lea"        -0.14
    "лини"       -0.14
    " sank"      -0.14
    Positive Logits
    Token        Logit
    "انه"        0.16
    "bian"       0.15
    "iddle"      0.15
    "gle"        0.15
    "moth"       0.14
    "らい"       0.14
    "zę"         0.14
    " Monterey"  0.14
    "AFX"        0.14
    "KI"         0.14
    Act Density 0.001%

    No Known Activations