INDEX
Explanations
significant nouns and specific variables in contexts related to events or categories
Negative Logits
V         -0.22
V         -0.20
K         -0.17
.V        -0.16
652       -0.16
µ         -0.15
Super     -0.15
v         -0.15
dep       -0.15
super     -0.15
Positive Logits
olini     0.17
xes       0.17
Hamilton  0.15
ASTER     0.15
anten     0.14
aster     0.14
BH        0.14
Ham       0.14
ham       0.14
°         0.13
Activations Density 0.056%