INDEX
Explanations
phrases indicating uncertainty or possibilities for future events
phrases indicating ongoing conditions or states that "remain" unchanged
Negative Logits
catentry    -0.76
erity       -0.72
wcsstore    -0.69
Surprise    -0.67
Mot         -0.65
BAT         -0.62
Eclipse     -0.61
ATT         -0.60
elf         -0.59
dress       -0.58
Positive Logits
intact        0.89
indefinitely  0.87
haunt         0.83
limbo         0.82
untouched     0.81
unchanged     0.79
viable        0.78
lishes        0.78
unaffected    0.76
forever       0.74
Activations Density: 0.144%