INDEX
Explanations
phrases containing the word "made"
the term "man-made" in various contexts
Negative Logits
IRE            -0.78
AMS            -0.77
VERTISEMENT    -0.73
OUGH           -0.71
hus            -0.71
ECTION         -0.71
assis          -0.70
":["           -0.70
)].            -0.70
FIRE           -0.68
Positive Logits
Fake           0.80
excuses        0.78
lda            0.68
userc          0.65
adjust         0.65
interventions  0.63
conspir        0.63
discoveries    0.62
excuse         0.62
artifacts      0.62
Activations Density: 0.028%