INDEX
    Explanations

    references to the middle layer or middle-aged concepts

    Negative Logits
    " autorytatywna"   -0.49
    " ſy"              -0.47
    " matel"           -0.46
    " Majefty"         -0.45
    " pouvoit"         -0.45
    "étoit"            -0.45
    " Gouver"          -0.44
    " suspen"          -0.43
    " ſol"             -0.43
    " pleaſure"        -0.43
    Positive Logits
    0.76
     middle
    0.75
     中
    0.73
     closest
    0.71
     Middle
    0.68
    middle
    0.66
    Middle
    0.65
    MIDDLE
    0.65
     tengah
    0.65
     reason
    0.63
    Act Density 1.692%

    No Known Activations