INDEX
    Explanations

    prepositions used in various contexts

    Head Attr Weights
    0:0.02
    1:0.01
    2:0.10
    3:0.05
    4:0.06
    5:0.02
    6:0.22
    7:0.28
    8:0.03
    9:0.03
    10:0.06
    11:0.07
    Negative Logits
    "agall"         -2.02
    " humility"     -1.71
    "idity"         -1.50
    " simplicity"   -1.50
    "¯"             -1.44
    " disclaim"     -1.43
    " patience"     -1.42
    " diplomacy"    -1.42
    "bara"          -1.41
    " realism"      -1.40
    Positive Logits
    "stats"         1.48
    "agues"         1.40
    " Statistical"  1.40
    "ogyn"          1.39
    "REE"           1.34
    "scribed"       1.34
    " athlet"       1.31
    "hack"          1.29
    "existent"      1.26
    " Rak"          1.25
    Act Density 0.003%

    No Known Activations