    Negative Logits
    Token                 Logit
    "UnitTesting"         -0.86
    "OGND"                -0.81
    [token missing]       -0.77
    "+#+#"                -0.73
    " onlyOwner"          -0.73
    " defaultstate"       -0.72
    "anthene"             -0.70
    " Administrativna"    -0.69
    " وتسجيلات"           -0.66
    "ScopeManager"        -0.65
    Positive Logits
    Token            Logit
    " even"          0.84
    " Even"          0.67
    "Even"           0.64
    "even"           0.63
    " حتی"           0.60
    " siquiera"      0.56
    " incluso"       0.54
    " EVEN"          0.54
    " даже"          0.52
    "Incluso"        0.49
    Activation Density: 0.005%

    No Known Activations