Explanations
proper nouns of individuals prevalent in specific events or contexts
references to individuals or entities being mentioned together
Negative Logits
Trend       -0.73
veyard      -0.70
artz        -0.68
blem        -0.64
slaught     -0.64
strang      -0.64
Warden      -0.64
Barrier     -0.62
hibition    -0.62
anyl        -0.62
Positive Logits
vying           0.81
equally         0.79
congratulated   0.73
edged           0.72
equal           0.71
halves          0.71
respectively    0.70
sides           0.70
ocating         0.67
separately      0.67
Activations Density: 0.092%