INDEX
Explanations
commands or suggestions related to giving voice or agency to oneself or others
expressions of agency and empowerment regarding individual choices
Negative Logits
Token         Logit
enegger       -1.11
Downloadha    -0.76
ModLoader     -0.73
Annotations   -0.72
ALD           -0.66
interstitial  -0.65
HAM           -0.64
lia           -0.64
imilar        -0.62
Medal         -0.62
Positive Logits
Token         Logit
breathe       0.78
shine         0.76
YP            0.75
unic          0.74
expire        0.69
suffice       0.68
aside         0.67
orce          0.66
ghai          0.66
loose         0.65
Activations Density: 0.081%