INDEX
    Explanations

    repeated references to the same concepts or entities

    Negative Logits
     Савезне    -0.55
    ftagPool    -0.52
     s          -0.48
     good       -0.47
    '][]        -0.47
     basic      -0.47
     \&         -0.46
     etc        -0.46
     is         -0.45
     in         -0.45
    Positive Logits
     ſche             0.95
     tartalomajánló   0.91
     ſtate            0.89
     myſelf           0.88
     Efq              0.86
     ſy               0.86
     fubject          0.85
     itſelf           0.84
     raiſ             0.84
     theſe            0.83
    Activation Density: 0.563%

    No Known Activations