    Negative Logits
    Token (quoted)    Logit
    " other"          -1.13
    "other"           -0.96
    " autres"         -0.96
    " Other"          -0.91
    "Other"           -0.90
    " OTHER"          -0.85
    " otros"          -0.84
    "autres"          -0.84
    " otras"          -0.83
    " outras"         -0.79
    Positive Logits
    Token (quoted)    Logit
    "PreferredItem"   0.56
    "ImageContext"    0.56
    "hithe"           0.54
    " purpoſe"        0.54
    "warded"          0.52
    " ſtate"          0.48
    " hyal"           0.47
    "wiſe"            0.47
    "adecimal"        0.47
    "uldades"         0.46
    Activation Density: 0.195%

    No Known Activations