INDEX
Explanations
words related to belief or faith
the presence of variations of the word "beef."
Negative Logits
)=(         -0.69
=-=-=-=-    -0.67
Passenger   -0.63
bear        -0.59
âĸ¬         -0.59
FTWARE      -0.59
âĶģ         -0.58
ply         -0.57
Poles       -0.57
git         -0.57
Positive Logits
icient      1.21
inition     1.19
orce        1.13
riend       1.08
onso        1.08
ield        1.02
ornia       0.99
ault        0.98
aq          0.97
iciency     0.97
Activations Density 0.017%