Explanations
references to the name "Phil" and related terms indicating philanthropy
Negative Logits
Token        Logit
uala         -0.19
rous         -0.18
yms          -0.15
ionales      -0.15
elta         -0.15
ception      -0.15
ety          -0.14
esta         -0.14
ette         -0.14
LastError    -0.14
Positive Logits
Token        Logit
ipp          0.33
ippi         0.26
ippines      0.26
omen         0.26
osopher      0.25
osoph        0.24
istine       0.24
ipe          0.23
adelphia     0.22
andering     0.22
Activations Density: 0.007%