INDEX
Explanations
words related to official recommendations or approvals
mentions of endorsements
Negative Logits
STEM          -0.71
arters        -0.71
ester         -0.68
Smithsonian   -0.67
rooms         -0.66
isen          -0.65
Rapt          -0.64
Pom           -0.64
Hen           -0.64
Hamp          -0.63
Positive Logits
endorsements     1.57
endorsement      1.56
endorse          1.26
endorsing        1.22
endors           1.22
endorsed         1.20
indal            0.87
loyalty          0.83
eering           0.83
recommendation   0.80
Activations Density: 0.008%
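
To give a sense of scale for the density figure, here is a minimal sketch that assumes density means the fraction of token positions on which the feature's activation is above zero. That definition, the function name `activation_density`, and the sample data are assumptions for illustration and are not taken from this page. Under that reading, 0.008% is roughly 8 active tokens per 100,000.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumed definition: the dashboard may compute density differently.
    """
    acts = list(activations)
    if not acts:
        raise ValueError("activations must not be empty")
    active = sum(1 for a in acts if a > threshold)
    return active / len(acts)


# Hypothetical data: 100,000 token positions, 8 of which activate the feature.
acts = [0.0] * 100_000
for position in (5, 120, 4_000, 9_999, 25_000, 50_000, 75_000, 99_999):
    acts[position] = 1.3

density = activation_density(acts)
print(f"{density:.3%}")
```

A feature this sparse fires only on a narrow set of contexts, which fits the explanations above: mentions of endorsements are rare in general text.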