    Negative Logits
    Token         Logit
    " temples"    -0.09
    " String"     -0.07
    " ABI"        -0.06
    " yoluyla"    -0.06
    " share"      -0.06
    "/the"        -0.06
    "илася"       -0.06
    " عالی"       -0.06
    " کی"         -0.06
    "rophic"      -0.06
    Positive Logits
    Token         Logit
    " Templ"      0.17
    " templ"      0.14
    " Temple"     0.13
    " temple"     0.13
    "templ"       0.10
    "TEMPL"       0.08
    "EMPL"        0.07
    " Demp"       0.07
    "τής"         0.07
    " Sho"        0.07
    Activation Density: 0.005%

    No Known Activations