Explanations
phrases related to speaking or acting on behalf of someone else
terms related to representation and advocacy
Negative Logits
Token      Logit
uve        -0.71
ãĤ£        -0.70
inctions   -0.69
Topic      -0.68
hig        -0.65
binary     -0.64
uesday     -0.63
aucus      -0.60
acan       -0.60
Bul        -0.59
Positive Logits
Token          Logit
selves         0.87
steps          0.85
stretched      0.80
brethren       0.80
counterparts   0.79
Majesty        0.79
mortal         0.71
andering       0.70
sensibilities  0.68
cohorts        0.68
Activations Density 0.192%