INDEX
Explanations
phrases related to small or minimal entities or activities
keywords related to "micro" concepts and phenomena
Negative Logits
Wem          -0.69
Moreno       -0.68
ĪĴ           -0.65
ktop         -0.64
Sham         -0.63
Wol          -0.63
Zimmer       -0.62
Petty        -0.62
Pole         -0.61
Hutchinson   -0.61
Positive Logits
batch        0.77
abytes       0.76
oute         0.75
abulary      0.74
γ            0.74
lections     0.74
rules        0.72
kat          0.72
tiny         0.72
seconds      0.72
Activations Density 0.077%