    Explanations

    phrases indicating communication or information exchange

    Head Attr Weights
    0:0.09
    1:0.02
    2:0.06
    3:0.05
    4:0.04
    5:0.04
    6:0.26
    7:0.04
    8:0.06
    9:0.23
    10:0.03
    11:0.03
    Negative Logits
    " Metall"   -3.65
    " robber"   -3.56
    " cyan"     -3.48
    " Nurs"     -3.47
    "Barn"      -3.45
    " Barnes"   -3.43
    " chees"    -3.43
    " Tes"      -3.41
    " Sed"      -3.38
    " Cena"     -3.37
    Positive Logits
    " FP"       9.99
    "FP"        8.95
    "fp"        6.95
    "FK"        3.96
    " NF"       3.89
    "Flo"       3.83
    "FI"        3.76
    " POV"      3.65
    "TP"        3.64
    " TP"       3.61
    Act Density 0.001%

    No Known Activations