INDEX
    Explanations

    abstract structures and organization within documents

    Negative Logits
     greateſt           -1.07
     itſelf             -1.03
     purpoſe            -1.03
     myſelf             -1.01
     <<<<<<<<<<<<<<     -0.97
     themſelves         -0.95
     pleaſure           -0.93
     houſe              -0.90
     ſche               -0.89
     ſever              -0.89
    Positive Logits
    ريكا                0.61
     l                  0.59
     r                  0.59
    2                   0.54
     res                0.53
     d                  0.53
     Hu                 0.52
                        0.51
     b                  0.51
     n                  0.51
    Activation Density: 0.600%

    No Known Activations