INDEX
Explanations
phrases related to defending or defending against accusations
Negative Logits
FORE      -0.72
hall      -0.72
hook      -0.63
explode   -0.61
rains     -0.60
çīĪ       -0.60
Recipe    -0.59
beam      -0.58
bleed     -0.58
way       -0.58
Positive Logits
against      1.14
Against      1.07
against      0.94
vigorously   0.86
atively      0.82
iveness      0.79
defending    0.76
indef        0.73
ously        0.73
ively        0.70
Activations Density 0.088%