INDEX
    Explanations

    references to counter-related concepts or activities

    occurrences of the word "counter" in various contexts

    Negative Logits
    " Forge"       -0.79
    " ABE"         -0.78
    " Known"       -0.75
    "doms"         -0.73
    " Pelicans"    -0.72
    "OOD"          -0.72
    " Roche"       -0.72
    " Tornado"     -0.70
    " Cursed"      -0.68
    " Pistons"     -0.67
    Positive Logits
    "measures"        1.23
    "intuitive"       1.16
    "attack"          1.13
    "intelligence"    1.13
    "balance"         1.10
    "counter"         1.09
    "offensive"       1.03
    "ror"             1.00
    "fact"            1.00
    "balanced"        0.98
    Activation Density: 0.016%

    No Known Activations