INDEX
    Explanations

    references to awards and recognitions in a professional context

    Negative Logits
    "_neurons"      -0.15
    "orsi"          -0.15
    "ught"          -0.14
    "rai"           -0.14
    " Quiz"         -0.14
    "cím"           -0.14
    " Nicholson"    -0.14
    "uffy"          -0.14
    "oref"          -0.13
    " McMahon"      -0.13
    Positive Logits
    " short"        0.65
    "short"         0.54
    " Short"        0.51
    "-short"        0.50
    "Short"         0.50
    " SHORT"        0.47
    "SHORT"         0.44
    "_short"        0.44
    ".short"        0.42
    "短"            0.41
    Act Density 0.100%

    No Known Activations