Explanations
mentions of the name "Paul."
Negative Logits
Anſ        -1.04
pleaſure   -1.01
preſent    -0.98
Bli        -0.98
Ibis       -0.96
Reſ        -0.89
ſy         -0.88
Bertie     -0.88
Monfieur   -0.87
iſt        -0.86
Positive Logits
Paul       1.47
Paul       1.27
PAUL       1.12
PAUL       0.97
Paulson    0.96
paul       0.93
paul       0.93
Paulus     0.91
Paulina    0.85
Paula      0.77
Activations Density: 0.047%