INDEX
Explanations
the word "reason" and related phrases indicating contemplation or justification
Negative Logits
sbm                 -0.81
iola                -0.68
CVE                 -0.67
ForgeModLoader      -0.65
IGH                 -0.64
Article             -0.63
Buy                 -0.63
natureconservancy   -0.62
fman                -0.61
FLAG                -0.60
Positive Logits
somew               1.42
somewhere           0.98
else                0.90
somet               0.83
*/(                 0.81
unlucky             0.79
forgot              0.79
someday             0.77
awfully             0.76
somehow             0.76
Activations Density 0.075%