    Negative Logits
    Token             Logit
     —                -2.48
                      -2.45
     Moreover         -2.42
                      -2.42
     jovial           -2.39
     Furthermore      -2.39
                      -2.34
     Certainly        -2.30
     Consequently     -2.28
     Also             -2.25
    Positive Logits
    Token             Logit
                      2.64
                      2.56
     activado         2.38
                      2.38
     abiye            2.34
                      2.33
                      2.33
                      2.31
     vivió            2.31
    学校の            2.30
    Activation Density: 0.012%

    No Known Activations