    Explanations

    breakthrough/advancement

    Negative Logits
    " RIPRODUZIONE"     -0.98
    "uxxxx"             -0.94
    "Autoritní"         -0.91
    "ConstraintMaker"   -0.91
    " chofe"            -0.90
    " HasFactory"       -0.90
    " ujednoznacz"      -0.88
    " uſe"              -0.88
    " autorytatywna"    -0.87
    " calendriers"      -0.86
    Positive Logits
    "↵↵"                0.56
    " A"                0.55
                        0.52
    " in"               0.52
    " ."                0.51
    "po"                0.48
    "<eos>"             0.48
    " bu"               0.47
    ","                 0.47
    "-"                 0.46
    Activation Density: 0.129%

    No Known Activations