INDEX
Explanations
different types of things
mentions of different categories or classifications
Negative Logits
rame     -0.69
Led      -0.67
pire     -0.66
wark     -0.65
charge   -0.63
wash     -0.63
lest     -0.62
oyer     -0.61
endum    -0.61
imate    -0.61
Positive Logits
types       3.59
types       2.70
kinds       2.61
type        2.51
Types       2.49
Types       2.28
type        1.90
sorts       1.87
Type        1.72
varieties   1.67
Activations Density: 0.013%