Explanations
references to being grounded and connected to reality
Negative Logits
Token         Logit
Å¡            -0.15
arger         -0.15
loat          -0.14
grö           -0.14
upos          -0.14
inu           -0.13
uilt          -0.13
.animations   -0.13
ë¹            -0.13
اÙĦخاÙħسة     -0.13
Positive Logits
Token         Logit
reality       0.35
real          0.32
Reality       0.31
realism       0.30
REAL          0.30
Reality       0.30
-real         0.29
realities     0.28
practical     0.28
real          0.27
Activations Density 0.190%