INDEX
Explanations
phrases related to making demands or requests
demands for accountability or clarification from authorities
Negative Logits
ohyd     -0.74
roach    -0.66
pha      -0.61
lins     -0.61
oop      -0.60
uyomi    -0.59
hops     -0.59
ipel     -0.59
Tracks   -0.59
fox      -0.58
Positive Logits
urgently        1.04
clarification   0.95
forgiveness     0.90
attention       0.88
pardon          0.87
urgent          0.80
revenge         0.77
permission      0.77
cancellation    0.77
reconsider      0.75
Activations Density: 0.251%