INDEX
Explanations
mentions of the word "manifesto"
references to manifestos
Negative Logits
avery      -0.71
pei        -0.69
phthal     -0.67
paying     -0.65
aina       -0.65
Lago       -0.63
skinned    -0.61
Il         -0.60
AE         -0.60
ahn        -0.59
Positive Logits
manifesto       1.43
eering          0.91
eers            0.87
declaration     0.86
eer             0.83
Manifest        0.80
declarations    0.77
ilogy           0.77
slogan          0.75
blueprint       0.75
Activations Density: 0.008%