INDEX
Explanations
mentions of the term "abb", specifically at different activation levels
references to specific terms related to "Abbott."
Negative Logits
Token        Logit
piece        -0.72
ptives       -0.66
Cheong       -0.63
govtrack     -0.63
stanbul      -0.63
Gutenberg    -0.61
obser        -0.60
xia          -0.59
erosion      -0.59
stal         -0.59
Positive Logits
Token        Logit
ucket        1.03
itt          1.02
arella       0.98
itte         0.95
atical       0.90
its          0.90
atar         0.89
inic         0.86
ler          0.86
inical       0.86
Activations Density: 0.029%