Explanations
terms related to organization and systems structure
Negative Logits
,        -0.97
-        -0.92
.        -0.92
in       -0.85
a        -0.85
(        -0.81
[blank]  -0.81
[blank]  -0.81
the      -0.81
&        -0.79
Positive Logits
незавершена   1.62
فريبيس   1.50
تقاوى   1.42
extAlignment   1.40
ſelf   1.39
متعلقه   1.36
Anſ   1.36
+#+#   1.33
ſelves   1.31
ſtate   1.30
Activations Density 1.182%