Explanations
The neuron looks for acronyms consisting of the letters "TH" followed by a number.
Occurrences of the acronym "TH" and its variations in different contexts.
Negative Logits
ãĤŃ            -0.71
ãĤ±            -0.69
detail         -0.69
assetsadobe    -0.67
cloth          -0.64
mens           -0.64
ãĤ¹ãĥĪ         -0.63
Canaver        -0.63
atories        -0.62
enda           -0.62
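Entries such as ãĤŃ, ãĤ± and ãĤ¹ãĥĪ in the list above are probably not corrupted text. They look like vocabulary strings from a byte-level BPE tokenizer in the GPT-2 style, where every raw byte is shown as a printable stand-in character. The page does not name the model or tokenizer, so that is an assumption. Under it, the sketch below rebuilds the standard GPT-2 byte-to-character table and inverts it to recover the underlying text. The helper names `bytes_to_unicode`, `UNICODE_TO_BYTE` and `decode_token` are illustrative and do not come from the page.

```python
import unicodedata


def bytes_to_unicode():
    """Byte -> printable character table, as in GPT-2-style byte-level BPE.

    Assumption: the dashboard's model uses this scheme; the page does not say.
    """
    # Bytes that already have a printable form map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is assigned a code point starting at U+0100.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Inverse table: stand-in character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Turn a vocabulary string back into the text it encodes."""
    # Guard against copy/paste that decomposed accented characters.
    token = unicodedata.normalize("NFC", token)
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in token)
    # A single token may hold only part of a UTF-8 sequence.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ãĤŃ", "ãĤ±", "ãĤ¹ãĥĪ", "TH"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

If the assumption holds, the three odd-looking entries decode to the katakana キ, ケ and スト, while plain ASCII tokens such as TH pass through unchanged.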
Positive Logits
ulhu           1.06
irteen         0.93
ttp            0.92
IRD            0.91
urst           0.86
OUGH           0.86
OTAL           0.84
orne           0.83
omas           0.81
TH             0.81
Activations Density: 0.008%