INDEX
    Explanations
    Negative Logits
    Token                 Logit
    " defaultstate"       -0.59
    "matchCondition"      -0.55
    "OGND"                -0.52
    "MigrationBuilder"    -0.52
    "TestingModule"       -0.51
    "iostream"            -0.51
    "waitKey"             -0.48
    " iconTwitter"        -0.48
    " InputDecoration"    -0.48
    "UserScript"          -0.48
    Positive Logits
    Token                 Logit
    " off"                1.33
    " Off"                1.14
    " OFF"                1.11
    "Off"                 1.09
    "off"                 1.06
    "OFF"                 0.88
    "toff"                0.70
    " オフ"               0.68
    " offs"               0.68
    "offs"                0.64
    Act Density 0.005%

    No Known Activations