INDEX
Explanations
phrases related to negative events or situations
words and concepts related to direction and structure
Negative Logits
Height      -0.48
Thing       -0.46
Month       -0.44
Qual        -0.44
Me          -0.43
Contribut   -0.43
Moving      -0.42
Acqu        -0.42
Corridor    -0.42
Rank        -0.42
Positive Logits
lly         0.52
minist      0.49
gery        0.49
geoning     0.48
xual        0.47
perjury     0.47
parchment   0.45
puff        0.44
imity       0.43
arial       0.43
Activations Density 0.657%