Explanations
proper names of individuals
the pronoun "who" in various contexts
Negative Logits
Georg         -0.64
Bas           -0.64
ulp           -0.61
Bound         -0.60
Guys          -0.60
reach         -0.60
IND           -0.59
SE            -0.58
NRS           -0.57
definition    -0.57
Positive Logits
oversaw       1.01
famously      1.01
oversees      1.00
died          0.99
owns          0.98
incidentally  0.97
resided       0.97
overcame      0.96
resides       0.95
suffers       0.95
Activations Density 0.100%