    Negative Logits
     and        -0.57
    '           -0.56
     '          -0.52
     ‘          -0.52
     O          -0.50
    </em>       -0.48
     in         -0.47
     "          -0.47
     \          -0.47
    Positive Logits
     Anſ        1.05
    Portale     0.97
    ſelves      0.96
     Reſ        0.93
     Eſ         0.92
     ſche       0.90
     OFDb       0.90
     متعلقه     0.89
     Савезне    0.89
     raiſ       0.89
    Act Density 0.069%

    No Known Activations