INDEX
Explanations
words related to causality, particularly indicating what propels or triggers certain actions or events
words that indicate causation or influence
Negative Logits
IFT           -0.69
urai          -0.66
Blackwell     -0.58
bral          -0.56
Kinnikuman    -0.56
oos           -0.55
çĦ            -0.55
Haku          -0.54
ift           -0.54
Recon         -0.54
Positive Logits
by            1.39
by            1.27
By            1.04
BY            0.97
By            0.94
partly        0.92
principally   0.92
BY            0.89
chiefly       0.86
solely        0.85
Activations Density: 0.209%