INDEX
Explanations
mentions of people and their relationships, especially in the context of work or partnerships
mentions of individuals and their background or relationships
Negative Logits
issu            -0.65
amacare         -0.63
uncertainties   -0.63
verage          -0.63
Transparency    -0.62
polarized       -0.62
itivity         -0.62
raq             -0.62
reiterate       -0.62
osphere         -0.61
Positive Logits
çͰ              0.88
fame            0.81
brother         0.72
catentry        0.72
nephew          0.70
Fell            0.69
Earl            0.69
pian            0.68
76561           0.68
pseudonym       0.67
Activations Density: 1.027%