INDEX
Explanations
words related to ranking or importance
words related to performance and evaluation
Negative Logits
waivers        -0.72
Consent        -0.71
designation    -0.70
Dragons        -0.69
Generator      -0.66
Jackets        -0.65
Prevention     -0.64
Knights        -0.64
Waves          -0.63
Writers        -0.63
Positive Logits
erous          1.18
istent         1.11
fficient       1.09
isable         1.09
istic          1.08
ulent          1.07
ensical        1.06
icky           1.03
ceptive        1.02
hesive         1.01
Activations Density: 0.343%