INDEX
    Explanations

    questions starting with "Why"

    Negative Logits
    Token              Logit
    "<bos>"            -3.06
    (blank)            -0.93
    "/***"             -0.82
    (blank)            -0.81
    "/*!"              -0.73
    " rehabilitate"    -0.67
    "/**"              -0.66
    "<?"               -0.65
    "<?"               -0.61
    "lateinit"         -0.60

    Positive Logits
    Token              Logit
    " bandung"          1.12
    " eiffel"           1.09
    " Manufact"         1.08
    " why"              1.07
    " WHY"              1.06
    "why"               1.05
    " swarovski"        1.05
    " milano"           1.02
    "WHY"               1.01
    " beverly"          1.01
    Act Density 0.120%

    No Known Activations