INDEX
Explanations
mentions of a specific location or position
occurrences of the word "here."
Negative Logits
omore      -0.63
ONSORED    -0.57
Heights    -0.55
Medical    -0.55
Brach      -0.55
Shape      -0.54
hygiene    -0.53
ggle       -0.51
catentry   -0.51
cig        -0.51
Positive Logits
tics       1.33
abouts     1.32
tical      1.22
tic        1.06
here       0.93
trak       0.77
newsp      0.69
orer       0.67
acus       0.63
lehem      0.63
Activations Density 0.041%