    Explanations

    phrases indicating causal relationships or consequences

    Negative Logits
        Token             Logit
        " Houſe"          -0.61
        " Вікіпе"         -0.53
        "Tikang"          -0.51
        "出版年"          -0.51
        " Савезне"        -0.49
        " Anſ"            -0.49
        " perſon"         -0.49
        " שוליים"         -0.49
        " ſind"           -0.49
        " nahilalakip"    -0.49
    Positive Logits
        Token             Logit
        " Dadurch"        0.45
        "bootstrapcdn"    0.45
        " consequently"   0.42
        " daardoor"       0.41
        " resulting"      0.41
        " conseguenza"    0.39
        " Consequently"   0.39
        "resulting"       0.38
        " Deshalb"        0.37
        "显得"            0.35
    Activation Density: 1.083%

    No Known Activations