INDEX
    Explanations

    specific names, likely representing people or significant figures in the text

    NEGATIVE LOGITS
    bsolute      -0.16
    angkan       -0.15
    řízení       -0.14
    دانلود       -0.14
    esteem       -0.13
    aimassage    -0.13
    رخ           -0.13
    luž          -0.13
    lords        -0.13
    ülük         -0.13
    POSITIVE LOGITS
    .'       0.14
    -chan    0.14
     Singh   0.14
    .’       0.14
    &A       0.13
    &C       0.13
    .J       0.13
    —who     0.13
    ately    0.13
    .K       0.13
    Act Density 0.098%

    No Known Activations