INDEX
Explanations
discussions around experience and the interplay between knowledge and competence
Negative Logits
REALLY  -0.20
VERY    -0.17
WOW     -0.15
HUGE    -0.14
(@      -0.14
;↵↵     -0.14
unos    -0.14
;↵      -0.14
??      -0.14
BIG     -0.13
Positive Logits
_    0.77
**   0.38
-_   0.37
*    0.34
!_   0.31
*_   0.30
._   0.27
_|   0.26
{\   0.24
__   0.24
Activations Density 0.927%