INDEX
Explanations
statements with "has been" followed by verbs
the verb "been," indicating actions or states that have occurred over time
Negative Logits
terday     -0.80
rones      -0.74
Awakens    -0.72
Relief     -0.69
ives       -0.69
™:         -0.68
Must       -0.67
lies       -0.67
odder      -0.66
would      -0.66
Positive Logits
subjected   0.97
likened     0.96
replaced    0.94
taken       0.91
shown       0.90
avering     0.89
deemed      0.87
criticized  0.86
seen        0.85
able        0.84
Activations Density 0.152%
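
For context on the density figure, here is a minimal sketch of how such a number could be computed. It assumes density means the fraction of token positions where the feature's activation is above zero; the page does not state its exact definition. The function name `activation_density`, the `threshold` parameter, and the sample values are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" is the share of positions where the feature fires.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical activations: the feature fires on 1 of 8 token positions.
sample = [0.0, 0.0, 0.0, 3.2, 0.0, 0.0, 0.0, 0.0]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 12.500%
```

Under this reading, a density of 0.152% would correspond to the feature firing on roughly 1.5 of every 1,000 token positions.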