INDEX
    Explanations

    proper nouns, specifically names

    the presence of the name "Sab" in various contexts

    Negative Logits
    " Hawaiian"       -0.69
    " fung"           -0.64
    "人"              -0.62
    " omission"       -0.61
    " Underground"    -0.59
    " tru"            -0.58
    " Fargo"          -0.58
    "SPONSORED"       -0.58
    "ï¸ı"             -0.57
    " killer"         -0.57
    Positive Logits
    "rina"            1.33
    "ģ"               0.99
    "arat"            0.98
    "qua"             0.98
    "onis"            0.96
    "eway"            0.96
    "ril"             0.91
    "eways"           0.90
    "riel"            0.89
    "aram"            0.89
    Act Density 0.042%

    No Known Activations