Explanations
the term 'plants' or variations related to plant-based topics
references to plant-based topics and diets
Negative Logits
xus       -0.78
dash      -0.75
DOS       -0.75
xes       -0.75
*/(       -0.74
cffffcc   -0.73
ROR       -0.73
deen      -0.70
nels      -0.68
ority     -0.67
Positive Logits
ronics    1.01
plants    0.91
inct      0.85
ain       0.80
endon     0.78
atoes     0.78
Plants    0.77
ivity     0.77
anto      0.74
yard      0.73
Activations Density: 0.034%