Explanations
references to people and their relationships
Negative Logits
Token       Logit
cross       -0.41
Ancient     -0.40
mstyle      -0.39
Solidar     -0.39
Instant     -0.39
Trọng       -0.38
Historic    -0.38
Historic    -0.37
indrical    -0.37
the         -0.37
Positive Logits
Token       Logit
그녀        0.85
them        0.82
them        0.79
herself     0.76
juos        0.76
him         0.76
she         0.76
She         0.75
THEM        0.75
Them        0.75
Activations Density: 0.187%