Explanations
numbers followed by punctuation
Negative Logits
Token        Logit
Any          -0.49
Any          -0.43
luckily      -0.43
fortunately  -0.42
upon         -0.42
if           -0.41
upon         -0.41
thankfully   -0.41
Cualquier    -0.40
amongst      -0.40
Positive Logits
Token        Logit
In           1.59
In           1.09
وفي          1.07
În           0.96
On           0.91
ใน           0.91
וב           0.84
At           0.82
În           0.70
Pada         0.66
Activations Density: 1.020%