INDEX
Explanations
phrases related to rules, instructions or requirements
modal verbs and phrases related to requirements or capabilities
Negative Logits
watching      -0.91
watching      -0.83
Watching      -0.82
checking      -0.82
mindful       -0.79
overseeing    -0.79
thinking      -0.75
reviewing     -0.75
forcing       -0.75
debating      -0.74
Positive Logits
contain       1.52
undergo       1.33
withstand     1.25
belong        1.25
survive       1.22
originate     1.22
emit          1.20
conform       1.20
behave        1.18
exist         1.16
Activations Density: 0.435%