    Negative Logits
    Token             Logit
    " törvény"        -0.41
    "<bos>"           -0.39
    "}\|"             -0.38
    "trex"            -0.36
    "وئ"              -0.34
    " której"         -0.33
    " dispositivi"    -0.33
    " schnee"         -0.32
    " tecnologias"    -0.32
    Positive Logits
    Token             Logit
    " pop"            1.86
    "pop"             1.21
    " Pop"            1.18
    "Pop"             1.15
    " pops"           1.00
    " popup"          0.98
    " POP"            0.96
    " popped"         0.96
    " popping"        0.87
    "ポップ"          0.87
    Activation Density: 0.007%

    No Known Activations