INDEX
    Explanations

    First, calculations, sums

    Negative Logits
    Token            Logit
    " misschien"     1.10
    " something"     1.02
    " fosters"       1.01
    " WHY"           0.99
    " pioneered"     0.98
    " maybe"         0.96
    " truly"         0.96
    " profoundly"    0.93
    " perhaps"       0.92
    " stories"       0.92
    Positive Logits
    Token            Logit
    "首先"           2.03
    " 首先"          1.67
    "Firstly"        1.55
    "Analyzing"      1.50
    "According"      1.47
    " Firstly"       1.45
    "We"             1.43
    " firstly"       1.42
    "Calculation"    1.41
    "First"          1.40
    Activation Density: 0.532%

    No Known Activations