INDEX
    Explanations

    instances where something is being recommended or proposed

    Negative Logits
    Token        Logit
    "bers"       -0.86
    "brance"     -0.83
    "erning"     -0.81
    "vation"     -0.77
    "ership"     -0.77
    "alde"       -0.77
    "tin"        -0.77
    "aos"        -0.76
    "ctors"      -0.74
    "reth"       -0.74
    Positive Logits
    Token             Logit
    " suggestions"    0.84
    " Parenthood"     0.76
    "Attempts"        0.72
    " hints"          0.72
    " suggested"      0.72
    " introdu"        0.71
    " unanimously"    0.70
    " easing"         0.69
    " hint"           0.68
    "Magikarp"        0.68
    Activation Density: 0.035%

    No Known Activations