INDEX
    Explanations

    however, but, however, but contrast

    Negative Logits
    enk              0.47
    idez             0.46
    zen              0.44
    pan              0.44
    ായിരുന്നു        0.44
     iii             0.42
     питань          0.41
    ardon            0.41
     aforesaid       0.40
     conclusão       0.40
    Positive Logits
     smaller         0.49
     هنوز            0.47
     kleinere        0.46
     grundsätzlich   0.46
     weaker          0.45
    整体             0.45
     최근            0.45
     generally       0.45
    整體             0.45
     البعض           0.43
    Act Density 0.002%

    No Known Activations