INDEX
Explanations
words related to strong or forceful actions, possibly violence or impact
references to significant achievements or performances in sports
Negative Logits
conve        -0.72
privacy      -0.69
mirrors      -0.65
cond         -0.65
Provision    -0.64
permitting   -0.64
decor        -0.63
Celestial    -0.63
delegates    -0.62
detailing    -0.61
Positive Logits
hit          4.32
Hit          1.82
hitting      1.77
hit          1.66
Hit          1.47
Hits         1.46
HIT          1.36
hits         1.32
shit         1.23
hig          1.19
Activations Density 0.016%
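
The density figure above is commonly read as the fraction of sampled tokens on which the feature fires at all. The sketch below illustrates that reading under stated assumptions: the function name `activation_density`, the zero threshold, and the sample data are all hypothetical and are not taken from the dashboard or from the tool that produced it.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: a token counts as "active" when its activation is strictly
    greater than the threshold (zero by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 2 active tokens out of 12,500 sampled tokens.
sample = [0.0] * 12498 + [1.3, 0.7]
density = activation_density(sample)
print(f"{density:.3%}")
```

With this made-up sample the ratio is 2 / 12,500, which matches the order of magnitude shown on the dashboard; a feature this sparse fires on roughly one or two tokens in every ten thousand.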