INDEX
Explanations
phrases that discuss potential outcomes or scenarios
Negative Logits
igon            -0.71
Saffron         -0.67
Deutscher       -0.65
ucu             -0.64
Iranian         -0.63
Dumas           -0.63
Iranian         -0.63
ה               -0.62
Damon           -0.62
Holmes          -0.61
Positive Logits
Possibility      1.38
possibility      1.35
possibility      1.29
Possibility      1.21
posibilidad      1.16
possibilidade    1.13
possibilité      1.11
possibilities    1.08
Possibilities    1.05
possibilità      1.04
Activations Density: 0.096%