INDEX
Explanations
phrases related to providing assistance or intervention to those in need
references to the word "it" and related pronouns indicating necessity or need
Negative Logits
Token      Logit
idth       -0.63
Pirates    -0.62
Spoon      -0.61
aughed     -0.58
Panel      -0.57
Suzuki     -0.57
rotein     -0.55
UDP        -0.55
Sanchez    -0.55
Tau        -0.55
Positive Logits
Token      Logit
self       1.13
ourselves  1.01
chy        0.98
selves     0.98
oneself    0.95
yourself   0.88
herself    0.87
alian      0.87
anyway     0.84
myself     0.83
Activation Density: 0.185%
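
The density figure above is not defined on this page. A minimal sketch, assuming it means the fraction of token positions on which the feature's activation is above zero: the function name activation_density, the threshold parameter, and the sample data are all hypothetical and used only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions where the feature fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 2000 token positions, 4 of which have a nonzero activation.
sample = [0.0] * 1996 + [0.7, 1.2, 0.3, 2.5]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.185% would correspond to the feature firing on roughly 185 out of every 100,000 token positions.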