    Negative Logits
                -0.44
    nių         -0.41
    iorna       -0.37
     monstruos  -0.35
    mid         -0.35
    ave         -0.35
    dsl         -0.35
    estat       -0.35
    ROIT        -0.35
    ¯           -0.34
    Positive Logits
     Efq        0.97
     Jefus      0.97
     Majefty    0.91
     myſelf     0.90
     Reſ        0.88
     Perſ       0.87
     itſelf     0.86
     purpoſe    0.85
     Theſe      0.85
     himſelf    0.81
    Activation Density: 0.047%

    No Known Activations