    NEGATIVE LOGITS
    Token            Logit
    kitten           -0.08
    alternatief      -0.08
    Gloucester       -0.08
    alternatively    -0.08
    alternativ       -0.08
    combina          -0.08
    esquema          -0.08
    approx           -0.07
    candy            -0.07
    countryside      -0.07
    POSITIVE LOGITS
    Token            Logit
    kesel            0.08
    Inherited        0.08
    耀               0.08
    forwarded        0.08
    applied          0.07
    passt            0.07
    desider          0.07
    attributable     0.07
    (blank)          0.07
    سواء             0.07
    Activation density: 0.007%

    No Known Activations