INDEX
    NEGATIVE LOGITS
    Token               Logit
    " ModelExpression"  -1.12
    " Majefty"          -1.11
    " Efq"              -1.04
    " raiſ"             -0.94
    " Chriftian"        -0.94
    " Monfieur"         -0.93
    " myſelf"           -0.93
    "Билгалдахарш"      -0.91
    " dafs"             -0.90
    "LookAnd"           -0.88

    POSITIVE LOGITS
    Token               Logit
    " a"                0.85
    " very"             0.77
    " good"             0.72
    " difficult"        0.70
    " really"           0.67
    " an"               0.62
    " only"             0.62
    " great"            0.62
    " Bel"              0.62
    " easy"             0.61
    Activation Density: 0.206%

    No Known Activations