INDEX
Explanations
articles and descriptors used to characterize individuals or roles
Negative Logits
Token            Logit
fn               -0.81
answer           -0.79
Edit             -0.79
uden             -0.78
tone             -0.76
events           -0.75
attacks          -0.72
advertisement    -0.71
encies           -0.71
EMA              -0.70
Positive Logits
Token            Logit
member           1.14
staunch          1.11
prolific         1.09
descendant       1.07
fixture          1.04
supporter        1.03
proud            1.03
lifelong         1.01
longtime         1.00
devout           1.00
Activations Density 0.106%
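
The density figure above can be read as the share of token positions on which the feature fires. The sketch below is a minimal illustration under that assumption; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and are not taken from the dashboard itself.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means (active positions) / (all positions).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 1 of 1000 token positions.
acts = [0.0] * 999 + [2.5]
density = activation_density(acts)
print(f"{density:.3%}")
```

Under this reading, a density of 0.106% would correspond to the feature activating on roughly one token in every thousand.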