Explanations
What's Wrong, What's Going, What's up
Negative Logits
Token       Logit
indeed      0.83
os          0.79
l           0.77
as          0.73
throughout  0.72
conform     0.71
to          0.70
ோர          0.70
ताई         0.69
forse       0.68
Positive Logits
Token       Logit
been        1.61
been        1.48
Been        1.35
BEEN        1.33
gotta       1.06
Been        1.02
Gonna       1.00
været       0.97
gonna       0.96
sido        0.93
Activations Density 0.002%