INDEX
Explanations
titles of books, podcasts, or other media
references to significant societal concepts or political movements
Negative Logits
theless            -0.68
etheless           -0.66
locally            -0.66
overall            -0.65
forestry           -0.64
urgent             -0.64
equipment          -0.60
internationally    -0.60
AMI                -0.60
indications        -0.60
Positive Logits
Problem            1.12
Principle          1.01
Debate             0.99
Syndrome           0.94
Manifest           0.93
Argument           0.92
Trilogy            0.92
fallacy            0.91
Massacre           0.91
ocalypse           0.91
Activations Density 0.643%