INDEX
Explanations
words related to induction and incorporation
terms related to induction and incorporation processes
NEGATIVE LOGITS
ende        -0.67
Correspond  -0.65
Machina     -0.62
lihood      -0.60
guard       -0.58
Ĥİ          -0.58
rall        -0.56
Codex       -0.55
Confeder    -0.55
pport       -0.54
POSITIVE LOGITS
into        1.91
INTO        1.77
into        1.66
Into        1.65
onto        1.34
hetto       0.81
tion        0.78
forth       0.75
overboard   0.75
onto        0.75
Activations Density 0.371%