Explanations
assertive language related to firmness or strong convictions
Negative Logits
Token              Logit
;;;;;;;;;;;;       -0.83
querade            -0.78
̶                  -0.76
vernment           -0.71
Emin               -0.70
NF                 -0.70
FactoryReloaded    -0.69
livest             -0.69
Lv                 -0.66
;;;;;;;;           -0.66
Positive Logits
Token       Logit
ament       1.25
ness        1.14
nesses      1.01
believer    1.00
footing     0.97
grounding   0.91
hearted     0.89
ures        0.87
ty          0.87
est         0.86
Activations Density: 0.014%