INDEX
Explanations
years within text
references to specific years and studies
Negative Logits
actionGroup  -0.82
ammers       -0.78
ands         -0.77
rats         -0.77
assies       -0.75
amines       -0.75
ences        -0.72
artifacts    -0.69
omever       -0.65
umers        -0.65
Positive Logits
snapshot     0.87
statement    0.87
broch        0.85
memo         0.83
affidavit    0.82
Vanity       0.82
pamphlet     0.81
memorandum   0.79
appendix     0.79
bombshell    0.78
Activations Density: 0.111%