INDEX
    Explanations

    programming code

    Negative Logits
     ENTITY         -0.07
    LOWER           -0.07
     speculate      -0.07
     firefighters   -0.07
    (effect         -0.07
     آنها           -0.07
     ffi            -0.06
     Hat            -0.06
     hlavou         -0.06
     Employment     -0.06
    Positive Logits
    liga            0.06
    ich             0.06
    utura           0.06
     mosques        0.06
    nika            0.06
     dvoj           0.06
     voks           0.06
    istros          0.06
    ño              0.06
    องท             0.06
    Activation Density: 0.038%

    No Known Activations