INDEX
Explanations
self-referential actions and states
Negative Logits
ایش  0.54
하신  0.52
జాగ్ర  0.51
manualmente  0.50
鵃  0.48
skilful  0.48
costruire  0.48
obter  0.46
LAGAB  0.46
بناء  0.45
Positive Logits
itself  0.75
its  0.62
Its  0.55
autonomously  0.53
خودش  0.50
自身  0.49
function  0.48
delivering  0.48
Its  0.47
自动  0.46
Activations Density 0.743%