INDEX
Explanations
informational, educational, ethical purposes
Negative Logits
weird      0.98
stuff      0.92
veggies    0.92
regs       0.89
pics       0.86
input      0.84
thing      0.81
tweaks     0.80
greats     0.79
ads        0.79
Positive Logits
educational      1.23
educational      1.19
informational    1.18
информа          1.11
Educational      1.08
शैक्षिक          1.02
informative      1.02
ethical          0.96
Inform           0.95
Educational      0.94
Activations Density 0.410%