INDEX
Explanations
words related to personal achievements and professional experiences
Negative Logits
Token           Logit
.               -0.51
deras           -0.48
γε              -0.44
kema            -0.43
Erfindung       -0.41
toiminta        -0.41
şeyler          -0.40
vaikka          -0.40
!               -0.40
primaryStage    -0.39
Positive Logits
Token           Logit
"},             0.89
]--;            0.87
StructEnd       0.87
]),             0.85
"),             0.85
")));           0.84
'),             0.84
)");            0.82
"):             0.82
"],             0.81
Activations Density 0.232%