INDEX
    Explanations
    Negative Logits
    Token             Logit
    "غال"             -0.08
    " happier"        -0.07
    "дут"             -0.07
    " descendants"    -0.07
    " tracking"       -0.06
    " coherent"       -0.06
    "_STRIP"          -0.06
    " Tracking"       -0.06
    " portrayal"      -0.06
    ".Clear"          -0.06
    Positive Logits
    Token             Logit
    " novice"         0.14
    " rookie"         0.09
    " newbie"         0.08
    " Rookie"         0.08
    "elf"             0.07
    " clazz"          0.06
    "resher"          0.06
    "types"           0.06
    " rookies"        0.06
    "σε"              0.06
    Activation Density: 0.007%

    No Known Activations