INDEX
Explanations
- different entities or entities interacting with each other
Negative Logits
lance     -0.55
lus       -0.52
lov       -0.52
Indust    -0.52
indust    -0.52
asta      -0.51
LER       -0.50
vich      -0.50
undown    -0.49
gow       -0.48
Positive Logits
fold           0.80
hundred        0.78
thousand       0.78
dimensional    0.78
dozen          0.73
consecutive    0.73
thirds         0.73
pairs          0.73
feet           0.73
teen           0.71
Activations Density: 14.029%