INDEX
Explanations
phrases related to judgment or criticism
conjunctions and phrases indicating more complex or compound statements
Negative Logits
Picks     -0.69
Doodle    -0.66
itivity   -0.66
activity  -0.65
anza      -0.65
haven     -0.63
Catch     -0.62
TF        -0.62
Skip      -0.61
ny        -0.60
Positive Logits
interrogated  1.15
executed      1.08
photographed  1.06
transported   1.06
punished      1.06
repaired      1.04
rewarded      1.04
evaluated     1.04
prosecuted    1.04
reused        1.03
Activations Density 0.268%
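
The density figure can be read as a simple ratio. The sketch below assumes "activations density" means the fraction of sampled token positions on which the feature's activation is non-zero; the page does not state the definition, so treat this as one plausible reading. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and used only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: density = (number of active positions) / (total positions).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 1000 token positions, 3 of which activate the feature.
sample = [0.0] * 997 + [0.4, 1.2, 2.5]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.300%
```

Under this reading, a density of 0.268% would correspond to the feature firing on roughly 27 of every 10,000 sampled tokens.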