INDEX
Explanations
mentions of actions related to saving or protection
instances of the word "save" and its variations related to rescue or preservation
NEGATIVE LOGITS
Intern       -0.67
yond         -0.66
kt           -0.66
VP           -0.63
quartered    -0.63
FG           -0.61
pora         -0.60
sclerosis    -0.59
Rush         -0.58
KER          -0.58
POSITIVE LOGITS
Save         0.93
Save         0.79
Saving       0.77
saving       0.75
Sanctuary    0.74
Lives        0.74
anza         0.73
luc          0.70
usc          0.69
lives        0.66
Activations Density 0.026%