INDEX
Explanations
people's names, especially the name "Philip"
Negative Logits
yrinth               -0.65
DERR                 -0.63
LOAD                 -0.63
eer                  -0.59
bnb                  -0.59
ipeg                 -0.58
lly                  -0.58
hips                 -0.57
inventoryQuantity    -0.56
EntityItem           -0.56
Positive Logits
Randolph             0.73
Morris               0.71
son                  0.71
Seymour              0.68
anthrop              0.64
entric               0.63
Rivers               0.63
obar                 0.62
istine               0.61
Pull                 0.61
Activations Density 6.477%