Explanations
references to gratitude and appreciation
Negative Logits
moschino        -0.89
Himo            -0.84
########.       -0.84
Hochspringen    -0.82
Seeder          -0.81
الحياه          -0.79
Surname         -0.78
margiela        -0.77
imagui          -0.77
<unused71>      -0.77
Positive Logits
<eos>           0.84
↵↵              0.73
The             0.72
                0.66
↵               0.60
...             0.60
↵↵↵             0.56
…               0.55
A               0.55
Do              0.55
Activations Density 0.308%