INDEX
Explanations
fall victim, fall short, fall into
NEGATIVE LOGITS
𝚍  0.70
𝙚  0.67
τή  0.66
ي  0.66
ously  0.63
不满  0.63
рил  0.61
ه  0.61
𝐝  0.61
𝚐  0.61
POSITIVE LOGITS
Falling  1.09
Falling  1.01
fall  0.96
Fall  0.96
falling  0.95
falls  0.92
fell  0.91
Falls  0.91
FALL  0.84
سقوط  0.80
Activations Density: 0.029%