    Explanations

    phrases reflecting nuanced opinions or contrasts regarding experiences and perceptions

    Negative Logits
    Token           Logit
    " sice"         -0.83
    " zwar"         -0.78
    " therefore"    -0.77
    "therefore"     -0.71
    "thus"          -0.70
    " infatti"      -0.70
    "λοι"           -0.67
    "ailleurs"      -0.63
    " thus"         -0.61
    "เลย"           -0.58
    Positive Logits
    Token           Logit
    " also"         1.13
    " nonetheless"  1.07
    " ändå"         1.05
    " nevertheless" 1.02
    " digress"      0.98
    "一方で"        0.97
    " samtidigt"    0.96
    "却是"          0.96
    " também"       0.96
    " también"      0.95
    Activation Density: 0.818%

    No Known Activations