Explanations
phrases related to comparisons or distinctions
connections or comparisons between ideas or concepts
Negative Logits
=(              -0.70
ILCS            -0.70
!".             -0.62
zers            -0.60
ãĢį             -0.60
.ãĢį            -0.59
((              -0.58
.–              -0.57
zona            -0.56
pointers        -0.56
Positive Logits
ILA             0.62
onential        0.59
understatement  0.58
elegance        0.58
arious          0.55
perfect         0.53
diligence       0.53
otic            0.53
Responsibility  0.53
isite           0.52
Activations Density: 0.387%