INDEX
Explanations
instances of punctuation marks, especially commas
Negative Logits
Token            Logit
Shinra           -0.66
exha             -0.65
Meter            -0.64
isolate          -0.61
bombing          -0.61
dome             -0.60
Rounds           -0.58
bay              -0.58
neighbourhood    -0.57
flank            -0.57
Positive Logits
Token            Logit
ï¸ı              1.18
âĢ               1.04
_-               1.02
sorry            0.91
cause            0.89
respond          0.89
Ïī               0.89
[/               0.88
в                0.87
where            0.87
Activations Density 0.146%
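
Three of the positive-logit tokens (ï¸ı, âĢ, Ïī) look like byte-level BPE display strings, not ordinary text. The sketch below assumes the vocabulary uses the GPT-2 style byte-to-character table, which the page itself does not state. Under that assumption it maps each displayed character back to its raw byte so the token can be read as UTF-8. The helper names are illustrative only.

```python
def bytes_to_unicode():
    """Byte -> display-character table used by GPT-2 style byte-level BPE.

    Assumption: the tokens on this page were rendered with this table.
    """
    # Bytes shown as themselves: '!'..'~', U+00A1..U+00AC, U+00AE..U+00FF.
    printable = (
        list(range(0x21, 0x7F))
        + list(range(0xA1, 0xAD))
        + list(range(0xAE, 0x100))
    )
    table = {b: chr(b) for b in printable}
    shift = 0
    for b in range(256):
        if b not in table:
            # Remaining bytes are shifted to code points 256 and above.
            table[b] = chr(256 + shift)
            shift += 1
    return table


CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def token_to_bytes(token):
    """Recover the raw bytes behind a displayed token string."""
    try:
        return bytes(CHAR_TO_BYTE[ch] for ch in token)
    except KeyError as exc:
        raise ValueError(f"not a byte-level display string: {token!r}") from exc


def decode_token(token):
    """Return the decoded text, or None if the bytes are not complete UTF-8."""
    try:
        return token_to_bytes(token).decode("utf-8")
    except UnicodeDecodeError:
        return None


if __name__ == "__main__":
    for tok in ["\u00ef\u00b8\u0131", "\u00e2\u0122", "\u00cf\u012b"]:
        raw = token_to_bytes(tok)
        print(tok, raw.hex(), repr(decode_token(tok)))
```

If the assumption holds, ï¸ı is the bytes EF B8 8F (U+FE0F, the emoji variation selector), Ïī is CF 89 (the Greek letter ω), and âĢ is E2 80, the first two bytes of a three-byte sequence, so it cannot be decoded on its own. Tokens such as в that are already shown as ordinary characters are outside this table and raise `ValueError`.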