    NEGATIVE LOGITS
    "adla"          -0.07
    " Faction"      -0.06
    " corridors"    -0.06
    "-appointed"    -0.06
    "Terr"          -0.06
    "GameData"      -0.06
    "Gender"        -0.06
    "Sum"           -0.06
    " الف"          -0.06
    "•"             -0.06
    POSITIVE LOGITS
    " main"         0.07
    " це"           0.07
    [not rendered]  0.06
    [not rendered]  0.06
    "(instance"     0.06
    "ezier"         0.06
    "_WP"           0.06
    " сет"          0.06
    " enforce"      0.06
    " likely"       0.06
    Activation Density: 0.024%

    No Known Activations