    Explanations

    "and" followed by a determiner or pronoun

    Negative Logits
    And                   1.06
     And                  1.04
    На                    0.99
    并且                  0.95
                          0.95
     Like                 0.94
     Therefore            0.90
    Like                  0.89
    そのため              0.89
    而且                  0.89

    Positive Logits
     automatisch          0.74
     segera               0.72
     automáticamente      0.70
     пусть                0.68
     mijn                 0.67
     eventuali            0.67
     justru               0.67
     artık                0.66
     setzen               0.65
     ')')                 0.65

    Act Density 0.005%

    No Known Activations