INDEX
Explanations
references to individuals or people in various contexts
Negative Logits
flat          -0.57
cu            -0.53
di            -0.53
st            -0.52
pr            -0.51
collection    -0.50
sp            -0.50
het           -0.50
fixed         -0.49
trade         -0.49
Positive Logits
someone       2.34
anyone        2.30
someone       2.28
anyone        2.28
alguien       2.19
Someone       2.17
Anyone        2.03
anybody       2.03
Someone       2.03
Anyone        1.99
Activations Density 0.290%