INDEX
Explanations
phrases indicating the presence of specific contexts
Negative Logits
africano          -0.75
africaine         -0.68
extérieure        -0.68
africain          -0.66
chimique          -0.66
RegressionTest    -0.65
virginity         -0.65
preuves           -0.63
negroes           -0.63
antaranya         -0.63
Positive Logits
into              0.90
a                 0.75
INTO              0.69
Into              0.69
queryInterface    0.68
the               0.64
Bronnen           0.62
position          0.62
setu              0.61
Into              0.61
Activations Density: 0.049%