INDEX
Explanations
phrases indicating a high degree or extent of something
expressions of generality or vagueness
Negative Logits
ortmund     -0.72
20439       -0.68
TEXTURE     -0.67
DEBUG       -0.67
AMY         -0.66
Hart        -0.65
Bir         -0.65
GPU         -0.64
hawk        -0.64
odium       -0.63
Positive Logits
nailed      0.94
everything  0.90
everywhere  0.89
boils       0.88
summed      0.87
everything  0.85
identical   0.83
worthless   0.82
sums        0.81
unchecked   0.78
Activations Density: 0.054%