INDEX
Explanations
places where things are being arranged or done
Negative Logits
so        -1.54
<bos>     -1.12
so        -1.03
So        -1.02
So        -1.02
SO        -0.85
så        -0.73
כך        -0.69
soo       -0.69
così      -0.69
Positive Logits
فريبيس      0.77
Cæsar       0.73
myſelf      0.66
AlterField  0.65
bâtiments   0.62
ovács       0.59
Monfieur    0.59
EconPapers  0.57
Pernambuco  0.57
Mero        0.57
Activations Density: 5.225%