INDEX
Explanations
phrases related to actions of restoration or reversal
occurrences of the word "restore" and related terms indicating a return to a previous state or condition
Negative Logits
rics      -0.93
QB        -0.85
chens     -0.84
kers      -0.82
Cola      -0.80
gerald    -0.78
onne      -0.75
edin      -0.74
ks        -0.73
wer       -0.73
Positive Logits
havoc        0.83
itial        0.82
restore      0.82
shire        0.81
vig          0.78
imar         0.76
harmony      0.76
restoration  0.76
restoring    0.74
owship       0.73
Activations Density: 0.791%