INDEX
    Explanations

    proper nouns, particularly names of people

    word-initial tokens such as Dil, Til, Vil, Pir, Ein, Rid

    Negative Logits
    " fubject"           -0.66
    "Зноскі"             -0.65
    "setViewportView"    -0.63
    " Vibe"              -0.60
    " Sparkle"           -0.59
    "Autoritní"          -0.59
    "Vue"                -0.58
    " bättre"            -0.58
    "aze"                -0.58
    "Algo"               -0.57
    Positive Logits
    " Til"    0.83
    "Til"     0.72
    " Vil"    0.72
    " Pir"    0.68
    "Pir"     0.65
    "Vil"     0.59
    " Tum"    0.57
    " Dil"    0.54
    " Rid"    0.54
    " Pil"    0.54
    Activation Density: 0.011%

    No Known Activations