INDEX
    Explanations
    Negative Logits
    3          0.54
    arlama     0.53
    temporal   0.48
    amik       0.47
    enças      0.47
    λώ         0.46
    ević       0.46
    áž         0.46
    ências     0.45
    ć          0.45
    Positive Logits
     noved        0.48
     forerunner   0.45
     Gaines       0.44
     building     0.42
     precursor    0.42
     detectable   0.42
     Verbesser    0.42
     helping      0.42
     reigning     0.41
     bounding     0.41
    Activation Density: 0.003%

    No Known Activations