INDEX
    Explanations

    positive evaluations and overall impressions of books

    Negative Logits
     FO           -0.17
    FO            -0.16
    itional       -0.14
    mong          -0.14
     Fo           -0.14
    eger          -0.14
    ubo           -0.14
    лÑıд          -0.14
    WER           -0.14
    anova         -0.14
    Positive Logits
    etas          0.15
    okus          0.15
    ëĮĢíĸī        0.15
    osten         0.14
     Companion    0.14
    ÌĨ            0.14
    oward         0.14
    oland         0.14
    adows         0.14
    ijke          0.14
    Activation Density: 0.078%

    No Known Activations