INDEX
    Explanations

    understanding and knowledge

    Negative Logits
    Token              Logit
    " Ambassador"      -0.08
    " Arab"            -0.07
    "стрел"            -0.07
    "考察"             -0.07
    " Moran"           -0.07
    " Soldier"         -0.07
    " Tiles"           -0.07
    "_relationship"    -0.07
    "vara"             -0.06
    "严格"             -0.06
    Positive Logits
    Token              Logit
    "(cc"              0.07
    " ogó"             0.07
    (blank)            0.07
    "leness"           0.06
    " effects"         0.06
    "ешь"              0.06
    ".Sum"             0.06
    " causa"           0.06
    "買う"             0.06
    (blank)            0.06
    Activation Density: 0.196%

    No Known Activations