INDEX
Explanations
phrases containing the word "fault"
references to faults or errors in various contexts
Negative Logits
ago      -0.77
xy       -0.74
¥µ       -0.74
het      -0.74
sov      -0.73
erva     -0.72
atten    -0.71
rooms    -0.70
iolet    -0.69
rons     -0.69
Positive Logits
fault          1.49
faults         1.17
Fault          1.05
forgiven       0.99
blame          0.88
excuse         0.76
blames         0.73
tolerant       0.72
forgiveness    0.70
tolerance      0.69
Activations Density: 0.006%