Explanations
different forms of the word "upgrade" or "downgrade"
terms related to rating changes, particularly downgrades and upgrades
Negative Logits
Token            Logit
stack            -0.88
advertisement    -0.87
Dover            -0.71
vous             -0.71
Caption          -0.67
Acts             -0.67
Pa               -0.66
Cthulhu          -0.65
eye              -0.64
laws             -0.64
Positive Logits
Token            Logit
graded           1.50
downgrade        1.31
grading          1.26
upgraded         1.13
grades           1.05
xual             1.02
upgrades         0.98
upgrading        0.91
upgrade          0.90
grades           0.88
Activations Density: 0.006%