Explanations
instances of the word "and"
Negative Logits
Stiles      -0.66
Искәрмәләр  -0.60
Socrates    -0.59
FAL         -0.59
medel       -0.59
Divina      -0.58
Lio         -0.58
doGet       -0.58
Ceramby     -0.57
Artem       -0.57
Positive Logits
Và          1.05
And         1.03
And         0.96
AND         0.94
AND         0.90
Και         0.81
그리고      0.80
\&          0.79
}&          0.78
+"&         0.78
Activations Density: 0.186%