Explanations
terms related to eligibility
Negative Logits
Token           Logit
↵               -0.52
<eos>           -0.51
Bod             -0.49
Rest            -0.48
Bo              -0.46
Sen             -0.46
professional    -0.46
ffichage        -0.46
knot            -0.45
pre             -0.45
Positive Logits
Token           Logit
himſelf         0.93
الحره           0.92
itſelf          0.92
Hauptartikel    0.89
كومونز          0.89
eligible        0.88
themſelves      0.86
myſelf          0.85
виправивши      0.85
CloseOperation  0.85
Activations Density: 0.233%