INDEX
    NEGATIVE LOGITS
    Token             Logit
    "KeepOriginal"    0.39
    "евна"            0.38
    " दूरी"            0.38
    " সরাসরি"         0.37
    "ovaná"           0.36
    " ঢেউ"            0.35
    " draft"          0.35
    "截止"            0.35
    "tcpHeader"       0.35
    " recid"          0.34
    POSITIVE LOGITS
    Token                Logit
    " initialization"    1.63
    " Initialization"    1.52
    " initialize"        1.51
    "初始化"             1.51
    "Initialization"     1.50
    " initializes"       1.50
    " initializing"      1.49
    " Initialize"        1.48
    " 初始化"            1.48
    "Initialize"         1.42
    Activation Density: 0.042%

    No Known Activations