INDEX
Explanations
"the" followed by abstract concepts
Negative Logits
0    0.59
1    0.54
8    0.50
6    0.48
(    0.48
:    0.46
=    0.45
'    0.44
4    0.43
+    0.43
Positive Logits
intricacies     0.64
complexities    0.63
süreç           0.53
allerlei        0.52
microcosm       0.52
ecosystems      0.52
vielfält        0.51
environments    0.50
nuances         0.50
lifecycle       0.49
Activations Density: 0.078%