INDEX
Explanations
terms related to qualities or attributes
concepts related to size, performance, and various ideologies
Negative Logits
Token       Logit
enium       -0.65
arag        -0.64
renheit     -0.64
ean         -0.63
esi         -0.62
oka         -0.60
ropolis     -0.60
xi          -0.58
anian       -0.58
Cluster     -0.57
Positive Logits
Token           Logit
considerations  1.12
matters         0.96
dictates        0.91
constraints     0.90
mattered        0.89
differences     0.86
aside           0.85
specificity     0.82
preced          0.81
outweigh        0.81
Activations Density: 0.477%