Explanations
references to scientific studies or publications
Negative Logits
purpoſe          -0.99
ſever            -0.87
Monfieur         -0.86
myſelf           -0.85
whoſe            -0.85
raiſ             -0.84
pleaſure         -0.83
diſt             -0.83
ſee              -0.81
juſt             -0.79
Positive Logits
et               3.99
Et               2.17
Et               1.92
ET               1.90
etc              1.36
Etc              1.25
Etc              1.24
ett              1.13
tagHelperRunner  1.07
usw              1.03
Activations Density 0.096%
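The page does not define the density figure. A minimal sketch, assuming it is the fraction of sampled token positions on which the feature's activation is above zero, could look like the following. The function name `activation_density`, the `threshold` parameter, and the toy data are hypothetical and only illustrate the arithmetic.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which
    the feature fires; the dashboard itself does not state this.
    """
    values = list(activations)
    if not values:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in values if a > threshold)
    return active / len(values)


# Toy data (not from the dashboard): 3 active positions out of 3,125.
toy = [0.0] * 3122 + [1.5, 0.2, 4.0]
density = activation_density(toy)
print(f"{density:.3%}")  # prints 0.096%
```

Under that assumption, a density of 0.096% means the feature fires on roughly one token in a thousand.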