INDEX
    Explanations
    Negative Logits
     ſeveral        -0.88
     whoſe          -0.86
     you            -0.82
     myſelf         -0.79
     itſelf         -0.77
     ſuch           -0.75
     themſelves     -0.75
     Monfieur       -0.75
     theſe          -0.73
     ſta            -0.71
    Positive Logits
     تضيفلها        0.65
    '               0.63
    حياته           0.58
     are            0.57
    ьаж             0.53
    tubers          0.52
    Assad           0.50
    verwijspagina   0.50
    出版年          0.48
    .*;             0.47
    Activation Density 0.069%

    No Known Activations