INDEX
    Explanations

    references to novels and their attributes

    Negative Logits
    fully       -0.19
    aan         -0.17
    wards       -0.16
    ed          -0.16
    yor         -0.16
    fulness     -0.15
    àµįà´       -0.15
    543         -0.15
    ugu         -0.14
    %B          -0.14
    Positive Logits
    -length      0.28
    ists         0.25
    ty           0.25
    istic        0.25
    ization      0.24
    lette        0.24
    izations     0.23
    ized         0.23
    isation      0.23
    ised         0.21
    Activation Density: 0.014%

    No Known Activations