    Explanations

    statements where a citation is needed

    references to citations needed in texts

    Negative Logits
    'milo'       -0.74
    'wives'      -0.65
    'mare'       -0.64
    ' thrott'    -0.63
    'wife'       -0.59
    'profits'    -0.58
    ' condos'    -0.58
    'venge'      -0.54
    'leep'       -0.54
    ' bragging'  -0.54
    Positive Logits
    ']'     1.24
    ' ]'    1.20
    ']['    1.16
    '])'    1.15
    '][/'   1.13
    ']"'    1.11
    ']:'    1.11
    ']).'   1.11
    ' ].'   1.09
    '].'    1.08
    Activation Density: 0.014%

    No Known Activations