INDEX
    Explanations

    mentions of people's names

    Negative Logits
    Token             Logit
    " margins"        -0.80
    "llor"            -0.79
    "CLASSIFIED"      -0.73
    " minded"         -0.68
    "¥ŀ"              -0.68
    "eering"          -0.65
    " lone"           -0.65
    "GoldMagikarp"    -0.65
    " conspicuous"    -0.65
    " actionGroup"    -0.64
    Positive Logits
    Token             Logit
    "zig"             1.17
    "forth"           1.09
    "ube"             0.96
    "emark"           0.93
    "alog"            0.92
    "olver"           0.92
    "ilo"             0.92
    "coni"            0.92
    "vier"            0.90
    " Marino"         0.90
    Activation Density: 5.460%

    No Known Activations