Explanations
the word "bon" with varying activation levels
the term "bon" and its various contexts
Negative Logits
Administ            -0.90
TY                  -0.73
REAM                -0.69
ECT                 -0.68
Proposition         -0.67
Clin                -0.65
AY                  -0.65
natureconservancy   -0.65
ORT                 -0.64
APS                 -0.62
Positive Logits
bon                 1.44
etooth              1.05
neau                1.03
ilib                0.90
otaur               0.88
bons                0.88
fman                0.86
uses                0.86
amic                0.83
wich                0.83
Activations Density 0.004%