INDEX
    Explanations
    No Explanations Found
    Head Attr Weights
    0:0.08
    1:0.06
    2:0.10
    3:0.09
    4:0.08
    5:0.08
    6:0.07
    7:0.08
    8:0.08
    9:0.08
    10:0.08
    11:0.08
    Negative Logits
    "iframe"       -1.68
    "ambo"         -1.65
    " Elon"        -1.63
    "items"        -1.63
    "Christmas"    -1.59
    "appiness"     -1.57
    " Jeep"        -1.53
    "Veh"          -1.52
    "Things"       -1.48
    "leck"         -1.47
    Positive Logits
    " metabol"          1.74
    " scientifically"   1.74
    " biologically"     1.71
    " bacter"           1.56
    " Forge"            1.54
    " germ"             1.53
    "abase"             1.51
    " experimenting"    1.51
    " experimental"     1.50
    " slic"             1.47
    Act Density 0.000%

    This feature has no known activations.