Explanations
phrases related to reputation or emerging as a significant entity
phrases indicating reputation or status
Negative Logits
Token       Logit
OTAL        -0.71
romy        -0.69
ength       -0.67
rost        -0.66
raq         -0.65
ongyang     -0.64
rent        -0.64
tails       -0.64
rone        -0.63
cloth       -0.62
Positive Logits
Token       Logit
pires       0.95
synonymous  0.90
well        0.89
pired       0.89
embod       0.82
unbeat      0.81
invincible  0.80
viable      0.78
opposed     0.77
adept       0.77
Activations Density: 0.162%
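
The density figure can be read as the share of token positions on which the feature fires. The sketch below works under that assumption. The page does not say how the figure was computed, so the function name `activation_density`, the zero threshold, and the toy data are all hypothetical and not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions with a non-zero
    (above-threshold) activation. The source does not define the term.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 10,000 token positions, 16 of which activate.
toy = [0.0] * 9984 + [1.5] * 16
density = activation_density(toy)
print(f"{density:.3%}")
```

Under this reading, a density of 0.162% would mean the feature is active on roughly 16 of every 10,000 tokens sampled.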