INDEX
    Explanations

    animal actions and outcomes

    Negative Logits
    " appunto"      0.51
    " Seriously"    0.44
    "Seriously"     0.44
    "なのです"      0.43
    "Donc"          0.42
    " portanto"     0.42
    " म्हणूनच"       0.41
    " właśnie"      0.41
    " именно"       0.40
    "그래서"        0.40

    Positive Logits
    " hingegen"     0.98
    " ebenfalls"    0.92
    " natomiast"    0.91
    " similarly"    0.86
    " likewise"     0.85
    "同樣"          0.80
    "同样"          0.76
    " dagegen"      0.74
    " kolei"        0.73
    "こちらも"      0.73

    Act Density 0.022%

    No Known Activations