INDEX
    Explanations
    Head Attr Weights
    0:0.13
    1:0.05
    2:0.05
    3:0.07
    4:0.05
    5:0.07
    6:0.20
    7:0.02
    8:0.10
    9:0.07
    10:0.07
    11:0.06
    Negative Logits
    " Bj"            -1.34
    " Bust"          -1.32
    "ć"              -1.31
    " Marin"         -1.29
    " cavalry"       -1.27
    "halla"          -1.26
    " assignment"    -1.24
    "pload"          -1.24
    " Boise"         -1.19
    " Thrones"       -1.17
    Positive Logits
    "pecially"       1.93
    "Attempts"       1.66
    "Sadly"          1.59
    "However"        1.58
    "Moreover"       1.55
    "USE"            1.53
    "Therefore"      1.52
    "Introdu"        1.51
    "heat"           1.51
    "Features"       1.49
    Act Density 0.003%

    No Known Activations