    Negative Logits
     Aff            -0.07
     Notification   -0.07
     Shock          -0.07
     foc            -0.07
     Has            -0.07
                    -0.07
     Fan            -0.07
     Erd            -0.06
     Magazine       -0.06
    Handlers        -0.06
    Positive Logits
     extraordin     0.08
    נקוד            0.07
    .ReadAllText    0.07
    خيارات          0.07
     slut           0.07
     assassin       0.06
                    0.06
                    0.06
     이번             0.06
    一艘              0.06
    Activation Density: 0.045%

    No Known Activations