Explanations
speculative biology and influence
Negative Logits
as      0.63
in      0.62
en      0.61
can     0.61
ut      0.59
add     0.58
A       0.58
one     0.58
il      0.57
top     0.56
Positive Logits
paralysie       0.61
voisines        0.55
employability   0.54
یسر             0.54
embarazo        0.54
melhorar        0.52
وترات           0.52
pattes          0.51
---’            0.50
marihuana       0.50
Activations Density: 0.001%