    NEGATIVE LOGITS
    Token          Logit
    " LUT"         -0.73
    " Efq"         -0.72
    " myſelf"      -0.69
    " leaſt"       -0.67
    " itſelf"      -0.67
    " raiſ"        -0.67
    " pleaſure"    -0.65
    " Reſ"         -0.65
    " Fins"        -0.62
    " Tuff"        -0.60
    POSITIVE LOGITS
    Token            Logit
    " Newspapers"    1.03
    " newspapers"    1.03
    " Newspaper"     0.84
    "newspaper"      0.83
    " newspaper"     0.80
    "Newspaper"      0.78
    " da"            0.76
    "zeitung"        0.73
    " journaux"      0.65
    "报纸"           0.65
    Activation Density: 0.068%

    No Known Activations