INDEX
Explanations
This neuron detects occurrences of the word “writ” (as in “writ of …”) in legal documents.
Negative Logits
('.')↵
-0.06
videoer
-0.06
Levels
-0.06
метою
-0.06
uptake
-0.06
-0.06
muştur
-0.06
değildir
-0.06
legacy
-0.06
_latency
-0.06
Positive Logits
التح
0.07
thanking
0.07
ellation
0.07
cap
0.07
Pel
0.06
Pieces
0.06
Rolling
0.06
Inserts
0.06
ilarity
0.06
otten
0.06
Activations Density 0.001%