INDEX
    Explanations
    Negative Logits
    ;'>             -0.07
     polygons       -0.07
     Jihad          -0.07
     granny         -0.07
     obyvatel       -0.07
    .TRA            -0.06
     Bella          -0.06
     nv             -0.06
     comerc         -0.06
                    -0.06
    POSITIVE LOGITS
    mentation       0.07
     affirmation    0.06
    imers           0.06
     humming        0.06
     affirm         0.06
     assures        0.06
     ili            0.06
    joining         0.06
    515             0.06
     medieval       0.06
    Act Density 0.003%

    No Known Activations