    NEGATIVE LOGITS
    Token             Logit
    " magnétique"     -0.63
    " informée"       -0.63
    " polícia"        -0.57
    " Murillo"        -0.56
    " Klicken"        -0.56
    " dourado"        -0.55
    " lápis"          -0.55
    " agência"        -0.54
    " vectorielle"    -0.54
    " złota"          -0.54
    POSITIVE LOGITS
    Token             Logit
    " dessert"        1.99
    " Dessert"        1.88
    "Dessert"         1.80
    "dessert"         1.75
    " desserts"       1.58
    " Desserts"       1.47
    "esserts"         1.23
    " десер"          1.08
    "デザート"        1.00
    " desert"         0.95
    Activation Density: 0.001%

    No Known Activations