    Negative Logits
    Token             Logit
    "┈┈"              -0.34
    "abestanden"      -0.34
    " referenties"    -0.31
    " Seelen"         -0.31
    "owią"            -0.30
    "msgTypes"        -0.29
    "Ligações"        -0.28
    " zas"            -0.28
    "forsk"           -0.28
    " dignidad"       -0.28
    Positive Logits
    Token             Logit
    " Head"           2.03
    " head"           1.94
    "Head"            1.77
    " HEAD"           1.52
    " Heads"          1.43
    " heads"          1.39
    "Heads"           1.19
    "HEAD"            1.07
    " tête"           1.07
    "ヘッド"          1.01
    Activation Density: 0.002%

    No Known Activations