INDEX
Explanations
references to knowledge and understanding in various contexts
Negative Logits
StructEnd          -0.58
AndEndTag          -0.56
WindowConstants    -0.56
Walkover           -0.56
SBATCH             -0.55
dymyr              -0.55
neko               -0.53
insegna            -0.52
jelent             -0.52
لاة                -0.51
Positive Logits
base               1.01
base               0.96
Knowledge          0.86
Knowledge          0.85
Base               0.83
gained             0.83
knowledge          0.83
knowledge          0.82
bases              0.82
bases              0.81
Activations Density 0.088%