INDEX
Explanations
phrases referring to different types or degrees of some quality or feature
Negative Logits
Token       Logit
aughs       -0.98
Lys         -0.95
oons        -0.94
sbm         -0.91
Emin        -0.89
Pigs        -0.88
Purs        -0.88
Files       -0.87
ires        -0.87
amples      -0.86
Positive Logits
Token          Logit
semblance      1.14
intermediary   1.06
unspecified    0.98
whatsoever     0.98
resembling     0.95
halfway        0.95
insula         0.94
meaningful     0.94
entity         0.94
else           0.92
Activations Density: 0.888%