INDEX
Explanations
phrases indicating consistency in research or findings
Negative Logits
Token          Logit
secuencia      -0.46
gancho         -0.44
idées          -0.44
suena          -0.44
caminhão       -0.44
nación         -0.44
pleaſure       -0.43
competición    -0.43
Infór          -0.41
idéia          -0.41
Positive Logits
Token          Logit
consistency    1.03
Consistency    1.00
consistent     0.98
consistent     0.96
consist        0.94
consistency    0.93
Consistency    0.92
Consistent     0.88
Consistent     0.82
inconsistent   0.79
Activations Density 0.441%
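
The page does not say how the density figure is computed. A common reading, assumed here, is the fraction of sampled token positions on which the feature's activation is above zero. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and do not come from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions where the feature fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 1000 token positions, 4 of which activate the feature.
sample = [0.0] * 996 + [0.7, 1.2, 0.3, 2.5]
density = activation_density(sample)
print(f"{density:.3%}")  # formatted as a percentage, like the figure above
```

Under this reading, a density of 0.441% would mean the feature fires on roughly 4 to 5 of every 1000 sampled token positions.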