INDEX
Explanations
references to specific people's roles or positions in various contexts
Negative Logits
Token           Logit
Boca            -0.16
Taco            -0.15
buquerque       -0.15
Ohio            -0.14
riere           -0.14
taco            -0.14
iks             -0.14
ObjectContext   -0.14
apas            -0.14
Foo             -0.13
Positive Logits
Token           Logit
Anne            0.39
Anne            0.33
Gilbert         0.30
anne            0.28
Montgomery      0.26
anne            0.26
Mar             0.20
Av              0.20
orphan          0.20
Green           0.20
Activations Density: 0.011%