INDEX
    Explanations
    Negative Logits
    initializeApp   -0.54
     DET            -0.54
     Polen          -0.53
     UnityEngine    -0.53
    なぎ            -0.51
     kü             -0.51
     strang         -0.50
     Blu            -0.50
     Kü             -0.49
    CodeGen         -0.47
    Positive Logits
     ?>/    1.07
    +"/     1.01
    {}/     0.96
     /\.    0.93
     '/     0.93
    ="/     0.91
    (`/     0.90
    "]/     0.90
    ('/     0.89
    :'/     0.89
    Act Density 0.368%

    No Known Activations