    Negative Logits
    Token        Logit
    " Walker"    -0.08
    "න්ද"        -0.08
    "pillar"     -0.08
    " Stephan"   -0.08
    " punct"     -0.07
    "Steph"      -0.07
    " negras"    -0.07
    "ห์"          -0.07
    "MJ"         -0.07
    (blank)      -0.07
    Positive Logits
    Token        Logit
    ".weapon"    0.11
    (blank)      0.09
    "одав"       0.09
    " Weapons"   0.08
    " arsenal"   0.08
    "措施"       0.08
    " हथ"        0.08
    "U"          0.07
    " Mel"       0.07
    (blank)      0.07
    Activation Density: 0.016%

    No Known Activations