Explanations
Insurance
The neuron consistently detects occurrences of the word “insurance” (in any context).
Negative Logits
Walk          -0.07
Path          -0.07
ADHD          -0.07
point         -0.07
Dost          -0.07
彦            -0.07
Path          -0.07
(team         -0.07
Mitarbeiter   -0.07
leaflet       -0.07
Positive Logits
Insurance     0.13
insurance     0.11
Insurance     0.10
UN            0.08
postage       0.07
ison          0.07
)             0.07
ongan         0.07
страх         0.07
Ins           0.07
Activations Density 0.009%
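
The page does not define how the density figure is computed. A minimal sketch, assuming it is the fraction of sampled token positions on which the neuron's activation exceeds zero: the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only to reproduce a value of the same magnitude.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    neuron fires at all; the source page does not state its definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: the neuron fires on 9 of 100,000 token positions.
sample = [0.0] * 99_991 + [1.5] * 9
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.009%"
```

Under this reading, a density of 0.009% would mean the neuron is active on roughly 9 tokens in every 100,000, which fits a neuron tied to a single word.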