INDEX
    Explanations
    Negative Logits
     그냥  0.46
     exigir  0.46
     yelling  0.44
    我都  0.44
    都要  0.44
    家伙  0.43
     asshole  0.43
     ==>  0.43
     crappy  0.43
     lmao  0.43
    Positive Logits
     joined  0.93
     immigrated  0.84
     grew  0.84
     emigrated  0.80
     graduated  0.77
     describes  0.76
     studied  0.74
     spent  0.73
     credits  0.73
     specializes  0.73
    Act Density 0.073%

    No Known Activations