Explanations
the word "break" or related phrases
references to the word "break."
Negative Logits
Token      Logit
uni        -0.77
Poster     -0.70
Shap       -0.69
ongh       -0.69
anooga     -0.68
landish    -0.68
ushi       -0.67
judicial   -0.66
emed       -0.66
imir       -0.66
Positive Logits
Token      Logit
break      3.67
breaks     2.57
break      2.45
Break      2.35
Break      2.12
breaking   1.87
breaks     1.78
broke      1.76
breakdown  1.70
breakup    1.60
Activations Density 0.011%