INDEX
Explanations
words related to faults or problems
instances of the word "fault."
Negative Logits
Token     Logit
atos      -0.80
ISTER     -0.78
CHO       -0.74
eston     -0.71
het       -0.71
¥µ        -0.70
cheat     -0.69
gdala     -0.68
chedel    -0.68
xon       -0.67
Positive Logits
Token     Logit
lessly    1.23
ously     0.93
lessness  0.83
fault     0.80
less      0.79
Fault     0.78
faults    0.76
finding   0.74
forgiven  0.71
ridden    0.71
Activations Density: 0.026%
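
A minimal sketch of how a density figure like the one above could be computed, assuming "activations density" means the fraction of token positions on which the feature fires (activation above zero). The page does not define the term, so this reading is an assumption; the function name `activation_density`, the `threshold` parameter, and the toy data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: a feature counts as "active" on a token when its
    activation is strictly greater than the threshold (0.0 by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 10,000 token positions, feature fires on 3 of them.
acts = [0.0] * 10_000
for position, value in [(5, 1.2), (420, 0.7), (9001, 2.4)]:
    acts[position] = value

density = activation_density(acts)
print(f"{density:.3%}")  # prints "0.030%"
```

Under this reading, a density of 0.026% would correspond to the feature being active on roughly 26 out of every 100,000 tokens.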