INDEX
    Explanations
    Negative Logits
    Token       Logit
    " Kar"      -1.27
    "kar"       -1.20
    "Kar"       -1.16
    " kar"      -1.15
    "KAR"       -1.09
    " KAR"      -0.93
    "kari"      -0.88
    " Kari"     -0.83
    " Karol"    -0.81
    "kara"      -0.81
    Positive Logits
    Token               Logit
    " chré"             0.68
    " complètes"        0.64
    " giapp"            0.63
    " vœux"             0.60
    " Efq"              0.60
    " fondament"        0.57
    " respectivement"   0.57
    " ferons"           0.57
    " épais"            0.57
    " démocr"           0.56
    Activation Density: 0.015%

    No Known Activations