INDEX
    Explanations

    I/we followed by informal verbs

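    Negative Logits
    (tokens quoted so that leading spaces stay visible; bracketed glosses are translations)
    Logit  Token
    0.63   "Additionally"
    0.55   "此外" [Chinese: "in addition"]
    0.51   "较为" [Chinese: "relatively"]
    0.51   " additional"
    0.49   " Additionally"
    0.49   "additional"
    0.48   "Additional"
    0.47   " एवं" [Hindi: "and"]
    0.46   "Furthermore"
    0.46   " অপরদিকে" [Bengali: "on the other hand"]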
    Positive Logits
    (tokens quoted so that leading spaces stay visible)
    Logit  Token
    0.92   " didn"
    0.86   " dunno"
    0.84   " gotta"
    0.80   " kinda"
    0.79   " probably"
    0.78   " gonna"
    0.77   " hadn"
    0.77   " couldn"
    0.77   " ain"
    0.77   " wasn"
    Act Density 0.293%
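
The page does not define Act Density. Assuming it means the percentage of sampled tokens on which the feature fires (activation above zero), the figure could be computed roughly as in the sketch below. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and not taken from the dashboard.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens with a non-zero
    activation; the dashboard's exact definition is not stated.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Made-up sample: the feature fires on 3 of 1024 tokens.
sample = [0.0] * 1021 + [1.7, 0.4, 2.2]
density = activation_density(sample)
print(f"Act Density {density:.3f}%")  # prints "Act Density 0.293%"
```

A density this low means the feature is sparse: under the assumption above, it is active on roughly 3 tokens in every 1000.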

    No Known Activations