    NEGATIVE LOGITS
    Token             Logit
    "Mixer"           -0.52
    " ister"          -0.42
    " Gorman"         -0.38
    " OpenAPI"        -0.38
    " Punj"           -0.38
    " ferrocarril"    -0.36
    "Voltar"          -0.36
    "écri"            -0.35
    " serons"         -0.35
    " vectorielle"    -0.35
    POSITIVE LOGITS
    Token             Logit
    " Jade"           2.50
    "Jade"            2.31
    " jade"           2.27
    "jade"            1.88
                      1.13
    " Jad"            1.00
    " 玉"             0.98
    "Jad"             0.93
    "翡翠"            0.89
    " jad"            0.83
    Activation Density: 0.001%

    No Known Activations