INDEX
Explanations
references to societal structures and comparisons
Negative Logits
Schäfer      -0.47
contrar      -0.45
occuper      -0.44
material     -0.43
pão          -0.43
bestaande    -0.42
zaś          -0.41
moral        -0.41
transQ       -0.41
moral        -0.40
Positive Logits
similar          0.65
Similar          0.58
Similar          0.57
similares        0.56
similar          0.56
RenderAtEndOf    0.53
comparable       0.52
akin             0.52
ähnliche         0.50
схо              0.48
Activations Density: 0.617%