INDEX
    Explanations
    NEGATIVE LOGITS
    Token         Logit
    " hygiene"    -0.07
    " noop"       -0.07
    "gio"         -0.07
    " prob"       -0.06
    "<Integer"    -0.06
    "reme"        -0.06
    ".nano"       -0.06
    "annie"       -0.06
    " novice"     -0.06
    "090"         -0.06
    POSITIVE LOGITS
    Token         Logit
    " cast"       0.19
    " Cast"       0.19
    "Cast"        0.17
    "cast"        0.14
    " casts"      0.14
    " CAST"       0.13
    " casting"    0.13
    "CAST"        0.12
    " Casting"    0.12
    "casting"     0.11
    Activation Density: 0.013%

    No Known Activations