INDEX
    Explanations

    mentions of social media accounts or handles

    Head Attr Weights
    0:0.09
    1:0.03
    2:0.06
    3:0.06
    4:0.09
    5:0.07
    6:0.17
    7:0.10
    8:0.08
    9:0.11
    10:0.04
    11:0.06
    Negative Logits
    " Mike"           -4.16
    "Mike"            -3.92
    " Manny"          -3.89
    " Michael"        -3.76
    " MJ"             -3.72
    " Saul"           -3.72
    "Michael"         -3.56
    " Marty"          -3.25
    " Pete"           -3.25
    " Marc"           -3.18
    Positive Logits
    " bottleneck"     3.02
    " barracks"       2.99
    "ADRA"            2.82
    " plantations"    2.72
    " biodiversity"   2.72
    " dracon"         2.71
    " villages"       2.71
    " exting"         2.70
    " forests"        2.70
    " genocide"       2.69
    Act Density 0.000%

    No Known Activations