INDEX
    Explanations

    educated person, colored spots, AI agent

    Negative Logits
    aginaw      0.59
    eal         0.55
    araham      0.54
    pygame      0.52
    aros        0.52
    discard     0.51
     วิชา       0.50
     உண்மைய     0.50
    kgs         0.50
    algebras    0.49
    Positive Logits
    OTE             0.51
    ,               0.47
                    0.46
    _               0.43
     Y              0.42
                    0.42
     particulière   0.41
     J              0.40
    ‌ها              0.40
     executive      0.40
    Act Density 0.001%

    No Known Activations