INDEX
Explanations
references to faith and its associated concepts
Negative Logits
ETT       -0.77
rrr       -0.69
ette      -0.69
Mullins   -0.69
Eugen     -0.69
huwa      -0.68
</b>      -0.67
Clarke    -0.65
mence     -0.65
Dov       -0.63
Positive Logits
FAITH         1.65
faith         1.52
Faith         1.50
faith         1.50
Faith         1.45
faiths        1.14
Faithful      0.98
faithfulness  0.95
faithful      0.93
ویکیپدیا      0.90
Activations Density: 0.115%