INDEX
    Explanations

    directives for user interaction with content or websites

    Negative Logits
    "ſelf"           -0.74
    "ſelves"         -0.68
    " ſont"          -0.65
    " müſſen"        -0.61
    " queſta"        -0.60
    "الحياه"         -0.60
    " ſich"          -0.59
    " increí"        -0.59
    "AxisAlignment"  -0.58
    " ſur"           -0.57
    Positive Logits
    " Kim"           0.39
    " irony"         0.38
    " references"    0.37
    "."              0.36
    " reference"     0.35
    "m"              0.35
    "attach"         0.35
    "reference"      0.34
    "BINARY"         0.34
    " Moulin"        0.33
    Activation Density: 0.015%

    No Known Activations