    Negative Logits
     insanely  0.49
     crappy  0.42
     teilweise  0.42
     দুইজন  0.41
    有点  0.41
     scary  0.40
     gotta  0.40
     сложно  0.40
     частично  0.40
    ್ರೆ  0.39
    Positive Logits
     wholly  0.68
     entirely  0.68
     fully  0.66
     truly  0.58
    完全  0.57
     entièrement  0.56
     pleinement  0.56
     totalmente  0.54
    完全に  0.53
     wholeheartedly  0.53
    Activation Density: 0.025%

    No Known Activations