Explanations
punctuation and formatting elements in the text
Negative Logits
ŃĶ            -0.74
savage        -0.73
omics         -0.73
plet          -0.71
ought         -0.69
outraged      -0.68
instinct      -0.66
impression    -0.66
rallying      -0.66
morphed       -0.65
Positive Logits
Alternatively    1.75
Lastly           1.73
Additionally     1.72
Conversely       1.70
Likewise         1.68
Therefore        1.65
Similarly        1.65
Otherwise        1.60
Furthermore      1.58
However          1.54
Activations Density 0.244%