INDEX
    Explanations

    user input/model response structure

    Negative Logits
    Token               Logit
    " envis"            2.65
    " enlargement"      2.61
    " VERS"             2.51
    " diagrams"         2.50
    " Penc"             2.50
    " Joyce"            2.49
    " strategies"       2.48
    " cheque"           2.48
    (blank in source)   2.45
    " arrows"           2.44
    Positive Logits
    Token   Logit
    7       3.18
    6       3.15
    3       2.97
    8       2.90
    5       2.89
    4       2.79
    0       2.68
    9       2.60
    2       2.28
    1       2.22
    Activation Density: 0.237%

    No Known Activations