INDEX
Explanations
phrases related to changes or modifications
references to changes in opinions, conditions, and situations
Negative Logits
Token       Logit
Magikarp    -0.72
ccoli       -0.67
Pear        -0.65
Dragon      -0.65
WAR         -0.63
Grab        -0.63
aturdays    -0.63
ulhu        -0.63
ngth        -0.62
Spoiler     -0.61
Positive Logits
Token           Logit
fortunes        0.84
radically       0.77
drastically     0.76
dial            0.75
tack            0.74
habits          0.74
tone            0.72
priorities      0.72
dramatically    0.71
clocks          0.71
Activations Density 0.222%
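The density figure above is not defined on this page. A common reading, assumed here, is the share of tokens on which the feature activates at all. The sketch below computes that share; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the share of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the fraction of tokens on which the
    feature fires, not an average activation magnitude.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 2 of 900 tokens.
sample = [0.0] * 898 + [1.3, 0.4]
print(f"{activation_density(sample):.3%}")  # 0.222%
```

Under this reading, a density of 0.222% means the feature is active on roughly 2 of every 900 tokens, which fits a feature tied to a narrow family of "change" phrases.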