    Negative Logits
    " Havre"  0.76
    " விவர"  0.73
    "rawl"  0.73
    " কেন্দ"  0.69
    "athons"  0.68
    " उत्पा"  0.68
    " алу"  0.66
    " バッグ"  0.66
    "CTED"  0.64
    "לך"  0.64

    Positive Logits
    " boolean"  3.24
    " Boolean"  3.21
    "Boolean"  3.20
    "boolean"  3.05
    " bool"  3.02
    "bool"  2.94
    "Bool"  2.90
    " Bool"  2.67
    " BOOLE"  2.66
    "BOOLE"  2.49

    Activation Density: 1.239%

    No Known Activations