    Explanations

    references to menu-related concepts

    Negative Logits
    Token              Logit
    [token missing]    -0.68
    " ("               -0.63
    ","                -0.61
    " in"              -0.57
    " a"               -0.56
    "."                -0.54
    "↵↵"               -0.54
    " “"               -0.53
    " I"               -0.53
    "/"                -0.51
    Positive Logits
    Token                Logit
    " pleaſure"          1.33
    "AsUp"               1.20
    " Theſe"             1.13
    "RectangleBorder"    1.12
    " myſelf"            1.10
    " themſelves"        1.10
    " houſe"             1.09
    " greateſt"          1.08
    " Jefus"             1.05
    " للمعارف"           1.04
    Activation Density: 0.119%

    No Known Activations