INDEX
    Explanations
    Negative Logits
     رع           -0.08
     vyrob        -0.07
    xCA           -0.07
     gameState    -0.07
    .Xna          -0.07
    gate          -0.06
    CACHE         -0.06
    _learning     -0.06
    case          -0.06
                  -0.06
    Positive Logits
    il            0.08
    IL            0.07
     Би           0.07
     Phil         0.07
    i             0.07
    Phil          0.07
    ічний         0.07
     We           0.07
    We            0.07
    -N            0.07
    Act Density 0.002%

    No Known Activations