    Explanations

    Activates strongly on the word "way".

    Negative Logits
        "usters"     -1.02
        "uster"      -0.88
        " livest"    -0.84
        "ĸļ"         -0.79
        "omore"      -0.73
        "oppable"    -0.71
        "asts"       -0.69
        "oubted"     -0.69
        "inately"    -0.68
        "lict"       -0.67
    Positive Logits
        "fare"        1.27
        "finding"     1.21
        "ward"        1.19
        "forward"     1.10
        "point"       1.06
        "finder"      0.90
        "points"      0.89
        " forward"    0.88
        "bill"        0.81
        "station"     0.77
    Activation Density: 1.161%

    No Known Activations