INDEX
    NEGATIVE LOGITS
    Token            Logit
    " VARIABLES"     -0.07
    " estar"         -0.06
    " runway"        -0.06
    "اشت"            -0.06
    "ype"            -0.06
    "iking"          -0.06
    " Structures"    -0.06
    "eza"            -0.06
    "Both"           -0.06
    "castle"         -0.06
    POSITIVE LOGITS
    Token            Logit
    " ресур"         0.07
    " mohli"         0.07
    "transparent"    0.07
    "ortal"          0.06
    " brutal"        0.06
    "mouseup"        0.06
    "_artist"        0.06
    " Et"            0.06
    " Bakanlığı"     0.06
    " Forg"          0.06
    Activation Density: 0.042%

    No Known Activations