INDEX
Explanations
terms related to cause and effect or the results of actions
terminology related to various types of effects and their impacts
Negative Logits
Token        Logit
pigeon       -0.77
mbuds        -0.69
Timber       -0.68
Alto         -0.67
mint         -0.63
Dud          -0.62
cin          -0.62
Patriarch    -0.61
Methodist    -0.60
antz         -0.59
Positive Logits
Token        Logit
iveness      1.27
uated        1.19
ual          1.15
uating       1.06
ively        1.05
ually        1.04
uel          1.01
bringer      1.00
uation       0.99
uality       0.98
Activations Density: 0.051%