INDEX
    Explanations

    phrases indicating consequences or hypothetical situations

    Negative Logits
    Token               Logit
    "rette"             -0.15
    "unos"              -0.14
    " Maj"              -0.14
    "ker"               -0.14
    "dden"              -0.14
    ".scalablytyped"    -0.13
    " maj"              -0.13
    "_defaults"         -0.13
    "itez"              -0.13
    " excell"           -0.13
    Positive Logits
    Token               Logit
    " would"            0.40
    "Would"             0.35
    " Would"            0.35
    "would"             0.34
    " wouldn"           0.31
    " zou"              0.26
    " Wouldn"           0.25
    " serait"           0.25
    " skulle"           0.24
    " würde"            0.23
    Activation Density: 0.231%

    No Known Activations