INDEX
Explanations
negatively framed or critical comments
negative assessments of experiences or qualities
Negative Logits
GEAR        -0.82
Versus      -0.76
Reloaded    -0.69
File        -0.67
Rivals      -0.66
induction   -0.66
Downing     -0.66
Theft       -0.66
Pieces      -0.66
Caller      -0.65
Positive Logits
famous       1.35
existent     1.32
great        1.25
named        1.22
popular      1.21
educated     1.20
successful   1.20
powerful     1.19
biased       1.19
expensive    1.18
Activations Density 0.035%