INDEX
    Negative Logits
     Diſ            -0.77
     Efq            -0.74
     Reſ            -0.74
     Anſ            -0.73
     Theſe          -0.72
     ویکی‌پدیا       -0.71
     purpoſe        -0.70
     whoſe          -0.69
     Conſ           -0.68
     auffi          -0.67
    Positive Logits
    出版年            0.57
    AnchorStyles     0.54
    onzo             0.50
     Tun             0.50
    وفاته            0.46
     EconPapers      0.44
    expandindo       0.42
    archy            0.42
    Enllaces         0.41
    Einzelnachweise  0.40
    Activation Density: 0.510%

    No Known Activations