Explanations
words related to importance or emphasis
terms related to major roles or functions
Negative Logits
Token        Logit
erved        -0.92
ceans        -0.80
erves        -0.79
ork          -0.78
paio         -0.77
LV           -0.75
ossession    -0.72
apons        -0.70
aping        -0.69
utics        -0.68
Positive Logits
Token             Logit
reason            1.12
culprit           1.09
stay              1.03
source            1.00
contributor       0.96
proponent         0.94
distinguishing    0.93
reasons           0.90
obstacle          0.90
difference        0.86
Activations Density 0.092%