Explanations
words related to founders or founding of organizations and companies
Negative Logits
purpoſe      -0.80
Theſe        -0.77
ſtate        -0.72
pleaſure     -0.71
ſever        -0.70
faſt         -0.68
perfons      -0.68
uſed         -0.68
ARXIV        -0.67
Chriftian    -0.67
Positive Logits
founder          0.78
dam              0.72
dam              0.68
damn             0.67
Founder          0.66
autorytatywna    0.65
IntoConstraints  0.65
jam              0.64
damn             0.60
GOT              0.60
Activations Density: 0.095%