Explanations
starting fresh small pivoting
Negative Logits
grasp         1.36
instinct      1.26
delusion      1.26
for           1.24
folly         1.23
horror        1.21
nonsense      1.21
inexplicable  1.19
mentality     1.18
regardless    1.16
Positive Logits
vuelve       1.80
különböző    1.79
giugno       1.78
informações  1.76
presentó     1.75
escolher     1.75
význam       1.73
destacó      1.73
způ          1.73
confirmó     1.73
Activations Density: 0.100%