INDEX
Explanations
phrases related to taking action or making a change in response to a situation
phrases suggesting action or concern regarding an issue
Negative Logits
Constructed   -0.71
Died          -0.70
Writ          -0.64
emies         -0.64
omet          -0.64
Creat         -0.64
told          -0.64
confessed     -0.63
authored      -0.62
iets          -0.62
Positive Logits
erous         0.83
administr     0.75
isance        0.72
arios         0.71
redist        0.69
behalf        0.68
halfway       0.68
improving     0.65
ILLE          0.65
injustice     0.65
Activations Density: 0.049%