    Explanations

    numbers followed by punctuation

    Negative Logits
    Token            Logit
    "Any"            -0.49
    " Any"           -0.43
    " luckily"       -0.43
    " fortunately"   -0.42
    " upon"          -0.42
    " if"            -0.41
    "upon"           -0.41
    " thankfully"    -0.41
    " Cualquier"     -0.40
    " amongst"       -0.40
    Positive Logits
    Token            Logit
    " In"            1.59
    "In"             1.09
    " وفي"           1.07
    " În"            0.96
    " On"            0.91
    " ใน"            0.91
    " וב"            0.84
    " At"            0.82
    "În"             0.70
    " Pada"          0.66
    Activation Density: 1.020%

    No Known Activations