INDEX
Explanations
references to relationships and social interactions
Negative Logits
T       -0.40
G       -0.39
?       -0.39
@       -0.39
rainha  -0.39
#       -0.38
his     -0.38
kema    -0.37
man     -0.36
hija    -0.35
Positive Logits
RenderAtEndOf     0.96
setVerticalGroup  0.94
―――――             0.93
$_"               0.90
themſelves        0.87
ſind              0.86
ſeveral           0.84
ſelves            0.84
iſt               0.84
$_(               0.84
Activations Density 0.412%