INDEX
    Negative Logits
     shape          -0.08
    _sq             -0.08
    yaw             -0.07
     tonnes         -0.06
    oundingBox      -0.06
                    -0.06
     велик          -0.06
    orns            -0.06
    _START          -0.06
    _BUF            -0.06
    Positive Logits
                    0.07
    emotion         0.07
     مدیر           0.06
                    0.06
    	initialize     0.06
     λεπ            0.06
    _RECE           0.06
     iddi           0.06
    DisplayStyle    0.06
                    0.06
    Act Density 0.033%

    No Known Activations