INDEX
Explanations
phrases related to long-term impact or duration
references to long-term effects or implications
Negative Logits
Token      Logit
Scrib      -0.95
Compass    -0.94
Wiz        -0.85
GEAR       -0.82
Pok        -0.80
ADRA       -0.79
Dickinson  -0.75
Polo       -0.75
Solitaire  -0.75
ulhu       -0.74
Positive Logits
Token      Logit
term       1.30
distance   1.20
enough     1.18
haired     1.12
lasting    1.10
sighted    1.09
eyed       1.07
suff       1.07
tailed     1.06
circ       1.06
Activations Density: 0.039%