Explanations
instances of the name "Paul" with relatively high activation
mentions of the name "Paul."
Negative Logits
lodge        -0.74
▬            -0.70
cyl          -0.67
Elven        -0.66
Pebble       -0.64
engineering  -0.62
▬▬           -0.61
CONTROL      -0.60
Limit        -0.60
PDATE        -0.60
Positive Logits
okes         0.99
McCartney    0.98
sburg        0.97
ine          0.97
son          0.95
ozo          0.94
ains         0.94
sen          0.92
sonian       0.87
Krugman      0.87
Activations Density 0.015%