INDEX
    Explanations
    NEGATIVE LOGITS
     between    -2.02
     BETWEEN    -1.84
     Between    -1.84
    Between     -1.83
     mellan     -1.79
    between     -1.74
     mellom     -1.71
     mellem     -1.66
     tussen     -1.62
     betwixt    -1.62
    POSITIVE LOGITS
     the        1.02
                0.81
     her        0.73
     a          0.73
     The        0.69
     M          0.68
     his        0.66
     "          0.65
     “          0.64
     those      0.63
    Act Density 0.100%

    No Known Activations